As artificial intelligence (AI) tools are becoming widely adopted by professionals and students alike, educators are realizing that they need to better understand text and image generators like ChatGPT and DALL-E.
On a rainy Thursday in October, Vered Shwartz—Assistant Professor in the UBC Department of Computer Science—kicked off a lecture series that caters to these needs, called AI in Education: Promises and Pitfalls.
Today’s AI tools produce results of significantly higher quality than those of just a couple of years ago. Text generators interpret queries and produce content that is often more coherent than what the average person could write; image generators create photorealistic pictures from a text description. Their advancement and subsequent popularity have prompted questions of academic integrity, since students can use these tools to complete assignments with nary an original thought to contribute. Unfortunately, AI detectors are unreliable, and the technology remains an impenetrable concept to many people outside the computer science field. Vered sought to remedy this with explanations of the generative AI training process, starting with large language models (LLMs).
The purpose of LLMs is to predict the next word in a sequence. They assign probability scores based on the large amounts of data they have consumed (i.e., been trained on). Vered provided an example: if an LLM were given the sequence “parrots are the most intelligent,” the word “animal” would rank relatively high among the probability scores for the next word. This process of generating statistical probabilities repeats over and over again to complete a full sentence, paragraph, or essay (or book!).
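To make the idea concrete, here is a toy sketch in Python of what “assigning probability scores to the next word” looks like. It is not the actual model behind ChatGPT; the candidate words and scores are invented purely for illustration.

```python
# Toy illustration of next-word prediction. The vocabulary and raw scores
# below are made up; a real LLM computes them with a neural network.
import math
import random

context = "parrots are the most intelligent"

# Hypothetical raw scores the model might assign to candidate next words.
raw_scores = {"birds": 4.2, "animal": 3.8, "creatures": 2.9, "pets": 1.5, "rocks": -3.0}

# Convert the scores into a probability distribution (softmax).
total = sum(math.exp(s) for s in raw_scores.values())
probabilities = {word: math.exp(s) / total for word, s in raw_scores.items()}

for word, p in sorted(probabilities.items(), key=lambda kv: -kv[1]):
    print(f"{context} ... {word!r}: {p:.2f}")

# Sampling from this distribution, over and over, is how a sentence
# (or paragraph, or essay) gets generated one word at a time.
next_word = random.choices(list(probabilities), weights=probabilities.values())[0]
print("chosen next word:", next_word)
```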
Given that the outcome of text generation largely depends on how the tool is trained, you might be wondering how this process works. Generative tools like ChatGPT and Bing are trained on massive text collections scraped from the internet. The availability of this data is part of what makes these tools possible, since so much text is readily accessible online (although some of it is copyrighted). Once exposed to this text, the model begins predicting the next word in a sequence, then comparing its chosen word to the correct answer. If the model is wrong, its parameters are adjusted; as this feedback accumulates, it becomes more accurate over time. OpenAI, the company behind ChatGPT, is notoriously secretive, so little is known about its process specifically. Eventually, though, the models reach such a level of complexity that they can respond to prompts rather than simply continuing a string of text.
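For readers who want a feel for that predict-compare-adjust loop, here is a deliberately tiny sketch. It uses word counts as stand-in “parameters” rather than a neural network, and the training text and update rule are invented for illustration; only the shape of the loop resembles how real LLMs are trained.

```python
# A minimal sketch of the next-word training loop: predict the next word,
# compare to the correct answer, adjust the parameters, repeat.
from collections import defaultdict

training_text = "parrots are intelligent birds and parrots are playful birds".split()

# "Parameters": counts of how often each word follows each other word.
counts = defaultdict(lambda: defaultdict(int))

for step in range(3):  # several passes over the data
    for previous, correct_next in zip(training_text, training_text[1:]):
        candidates = counts[previous]
        prediction = max(candidates, key=candidates.get) if candidates else None
        if prediction != correct_next:
            # The model was wrong: nudge the parameters toward the correct answer.
            counts[previous][correct_next] += 1

# After training, the model predicts the most likely word to follow "parrots".
print(max(counts["parrots"], key=counts["parrots"].get))  # -> "are"
```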
Image-generating models also start with a large collection of images, but there are various ways to train a model on them. Vered highlighted diffusion training, in which random noise is gradually added to an image until it is unrecognizable, and the model is then trained to restore the image by predicting the noise it must remove. So far, the results are of limited quality, especially when it comes to generating an average-looking human face. The tools are also challenged by complex or unrealistic prompts.
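Here is a rough sketch of that diffusion idea, using small NumPy arrays in place of real images and a stand-in “noise predictor” in place of a trained network. Nothing here reflects how DALL-E is actually implemented; it only illustrates the forward (noising) and reverse (denoising) steps Vered described.

```python
# Toy diffusion sketch: add noise step by step, then remove predicted noise.
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((8, 8))   # stand-in for a training image
steps = 10

# Forward process: gradually add random noise until the image is unrecognizable.
noisy = image.copy()
added_noise = []
for t in range(steps):
    noise = rng.normal(scale=0.1, size=image.shape)
    added_noise.append(noise)
    noisy = noisy + noise

# During training, a network learns to predict the noise at each step.
# Here we simply return the true noise, standing in for a perfect predictor.
def predict_noise(noisy_image, t):
    return added_noise[t]

# Reverse process: remove the predicted noise step by step to restore the image.
restored = noisy.copy()
for t in reversed(range(steps)):
    restored = restored - predict_noise(restored, t)

print("reconstruction error:", np.abs(restored - image).max())  # ~0 with a perfect predictor
```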
Interestingly, both types of generators, text and image, are bad at numbers, which may come as a surprise, since computers are often thought to excel at the logic of mathematics. However, due to the scope of its training, ChatGPT is very poor at doing your math homework for you, and if prompted to include eight of something in an image, DALL-E is wrong more often than it is right. The models also exhibit the same biases present in their training data. For example, if DALL-E is trained on more images of male scientists than female scientists, it is more likely to produce an image of a man when given an ungendered prompt to generate a scientist. In essence, AI tools are not objective or infallible. They reinforce existing social hierarchies and power dynamics.
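A tiny, entirely made-up simulation shows why a model that simply mirrors its data reproduces the same skew. The 80/20 split below is invented for illustration and is not a statistic about DALL-E or any real dataset.

```python
# Toy illustration of training-data bias carrying through to generated output.
import random
from collections import Counter

# Imagine a training set in which images tagged "scientist" skew heavily male.
training_images = ["male scientist"] * 80 + ["female scientist"] * 20

# A model that merely reflects its data reproduces the same skew when asked
# for "a scientist" with no gender specified.
generated = [random.choice(training_images) for _ in range(1000)]
print(Counter(generated))  # roughly 80% "male scientist", 20% "female scientist"
```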
Towards the end of the event, the mood picked up as we turned to questions from the audience. Many educators were in the room, wondering how these generators could be used responsibly in the classroom. If ChatGPT requires specific, back-and-forth dialogue with the user to generate fitting content, doesn’t the user need to exercise some degree of critical thinking? Wouldn’t students still be learning, then, if they interacted with ChatGPT? “It depends on the goals of the assignment,” Vered said. If the goal is to have students improve their writing skills, creating sentence structures and flow that enhance readability, then ChatGPT is counterproductive. However, perhaps the instructor wants their students to use their preexisting knowledge of historical phenomena to edit ChatGPT’s output for accuracy. Text generation is useful in that scenario, but only because it so often fails to tell the truth.
Vered Shwartz is an Assistant Professor of Computer Science at UBC and a CIFAR AI Chair at the Vector Institute. Her research concerns natural language processing, with the fundamental goal of building computer programs that can interact with people in natural languages. In her work, she teaches machines to apply the human-like common-sense reasoning that is required to resolve ambiguities and interpret underspecified language. Before joining UBC, Vered completed her PhD in Computer Science at Bar-Ilan University and was a postdoctoral researcher at the Allen Institute for AI and the University of Washington.
Post by: Kyla McCallum, Green College Content Writer and Resident Member.