Library Guides: AI in Education for Students: How Generative AI works

How Generative AI works

Generative Artificial Intelligence, or generative AI, is a class of computer algorithms that can create digital content, including text, images, video, music and computer code. These algorithms work by deriving patterns from large sets of training data; the patterns become encoded into predictive mathematical models, a process commonly referred to as ‘learning’. Generative AI models generally do not keep a copy of the data they were trained on, but rather generate new content from the patterns they encode. People can then use interfaces like ChatGPT or Midjourney to input prompts, typically instructions in plain language, to make generative AI models produce new content.

As practical, high-quality generative AI becomes available, it can be a helpful tool for everyday work, with potential applications in areas as diverse as art, writing, and software development.

At the core of a generative AI system is a trained deep-learning model that processes a user's input (the prompt) and generates text, images, or other media in a human-like fashion. The model is trained on massive amounts of data so that it learns the patterns in that data. For example, it learns that certain words tend to follow others, or that certain phrases are more common in certain contexts. The model uses the prompt to produce a completion, which is then presented back to the user.
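To make the idea of "learning that certain words tend to follow others" concrete, here is a minimal sketch in Python. It is not how systems like ChatGPT are built (they use large deep-learning models). It only counts which word follows which in a tiny made-up training text, then extends a prompt with the most frequent follower each time. The function names and the three-sentence toy corpus are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for every word, which words follow it in the training text."""
    follower_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair each word with the word that comes right after it.
        for current_word, next_word in zip(words, words[1:]):
            follower_counts[current_word][next_word] += 1
    return follower_counts


def complete(model, prompt, max_new_words=5):
    """Extend the prompt by repeatedly adding the most frequent follower."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_words):
        followers = model.get(words[-1])
        if not followers:
            break  # the last word never appeared with a follower in training
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical toy training data; real models learn from vastly more text.
TOY_CORPUS = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]

model = train_bigram_model(TOY_CORPUS)
print(complete(model, "the dog", max_new_words=3))
```

With this toy corpus the prompt "the dog" is completed to "the dog sat on the". Note that the model stores only counts of word pairs, not the training sentences themselves, and that a word it has never seen (for example "zebra") cannot be extended at all.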

The video below provides a simple explanation of the mechanism of generative AI.

The quality of the generated output depends on several factors, including the amount and quality of the training data, the prompt's complexity, and the model's size. Larger models usually generate better output but require more computing power and resources. Notable examples of generative AI systems include ChatGPT and Bard, which focus on language generation, and Midjourney and DALL-E, which focus on image generation.

Some everyday applications of generative AI

Predictive text

This technology makes typing on a device easier by suggesting words the user may wish to insert in a text field. In the example below, predictive text suggests the word "you" as the next word after "Good morning, how are".

Screenshot of a screen keyboard showing autocomplete suggestions
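The behaviour in the screenshot can be imitated with a short Python sketch. It assumes a keyboard that ranks suggestions purely by how often each word followed the last typed word in earlier messages. Real keyboards use more sophisticated models, and the sample messages and function names below are invented for illustration.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and keep only words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def build_suggester(messages):
    """Record how often each word follows another in past messages."""
    next_words = defaultdict(Counter)
    for message in messages:
        tokens = tokenize(message)
        for previous, following in zip(tokens, tokens[1:]):
            next_words[previous][following] += 1
    return next_words


def suggest(next_words, typed_text, k=3):
    """Return up to k suggestions for the word after the typed text."""
    tokens = tokenize(typed_text)
    if not tokens or tokens[-1] not in next_words:
        return []  # nothing typed yet, or the last word was never seen
    return [word for word, _ in next_words[tokens[-1]].most_common(k)]


# Hypothetical message history standing in for the keyboard's training data.
SAMPLE_MESSAGES = [
    "Good morning, how are you?",
    "Hi, how are you today?",
    "How are things at work?",
    "We are running late.",
]

suggester = build_suggester(SAMPLE_MESSAGES)
suggestions = suggest(suggester, "Good morning, how are")
print(suggestions)
```

Because "you" follows "are" more often than any other word in the sample messages, it is the first suggestion, matching the screenshot. The remaining slots are filled by less frequent followers such as "things" and "running".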

Image style transfer

This technology generates a new image by combining the content of one image with the style of another. The example below is an image generated with Bing Image Creator that combines the content of the painting "Mona Lisa" with the style of "Starry Night".

Mona Lisa in the style of Starry Night

Copyright© The University of Sydney. Unless otherwise indicated, 3rd party material has been reproduced and communicated to you by or on behalf of the University of Sydney in accordance with section 113P of the Copyright Act 1968 (Act). The material in this communication may be subject to copyright under the Act. Any further reproduction or communication of this material by you may be the subject of copyright protection under the Act. Do not remove this notice.