Scientists Made a Mind-Bending Discovery About How AI Actually Works

Researchers are starting to unravel one of the biggest mysteries behind the AI language models that power text and image generation tools like DALL-E and ChatGPT.

For a while now, machine learning experts and scientists have noticed something strange about large language models (LLMs) like OpenAI’s GPT-3 and Google’s LaMDA: they are inexplicably good at carrying out tasks they haven’t been specifically trained to perform. It’s a perplexing phenomenon, and just one example of how difficult, if not impossible, it can be to explain in fine-grained detail how an AI model arrives at its outputs.

In a forthcoming study posted to the arXiv preprint server, researchers at the Massachusetts Institute of Technology, Stanford University, and Google explore this “apparently mysterious” phenomenon, which is called “in-context learning.” Normally, to accomplish a new task, most machine learning models need to be retrained on new data, a process that can require researchers to feed in thousands of data points to get the output they want, a tedious and time-consuming endeavor. With in-context learning, by contrast, a model picks up a new task from just a handful of examples supplied directly in its prompt, without any update to its underlying parameters.
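
To make the contrast concrete, here is a minimal Python sketch of what an in-context learning prompt looks like. The `generate` function is a hypothetical placeholder for a real LLM call, and the English-to-French translation task is purely illustrative; the point is that the demonstrations live in the prompt, not in the model’s training data.

```python
# Minimal sketch of in-context (few-shot) learning.
# No model weights are updated: the "training" happens entirely inside the prompt.

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a call to an LLM such as GPT-3."""
    raise NotImplementedError("Replace with a real model call.")

# A few demonstration pairs for a task the model was never fine-tuned on:
# translating English color words into French.
demonstrations = [
    ("red", "rouge"),
    ("green", "vert"),
    ("blue", "bleu"),
]

def in_context_prompt(query: str) -> str:
    # Pack the demonstrations and the new query into a single prompt;
    # the model must infer the task from these examples alone.
    lines = [f"English: {en}\nFrench: {fr}" for en, fr in demonstrations]
    lines.append(f"English: {query}\nFrench:")
    return "\n\n".join(lines)

if __name__ == "__main__":
    # Inspect the prompt; a real run would pass it to the model:
    # answer = generate(in_context_prompt("yellow"))
    print(in_context_prompt("yellow"))
```

The question the researchers are probing is why this works at all: the model’s parameters never change, yet a few prompt examples are enough for it to behave as if it had been trained on the new task.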