In the midst of a conversation with an acquaintance, your brain might skip ahead, anticipating the words that the other person will say. Perhaps then you will blurt out whatever comes to mind. Or maybe you will nurse your guess quietly, waiting to see if—out of all the hundreds of thousands of possibilities—your conversational partner will arrive at the same word you have been thinking of. Amazingly, your companion will often do so.
How does the brain do this? Figuring out how we process language has long been a focus for neuroscientists. Massachusetts Institute of Technology researchers brought a new take to the question using a technique called integrative modeling. They compared dozens of machine-learning algorithms called neural networks to brain scans and other data showing how neural circuits function when a person reads or listens to language. The researchers had a two-part goal: they wanted to figure out how the brain processes language and in doing so push the boundaries of what machine-learning algorithms can teach us about the brain.
The modeling technique suggests that next-word prediction, the operation at the heart of algorithms that suggest words as you compose your texts and e-mails, may play a key role. The researchers discovered that models that excel at next-word prediction are also best at anticipating brain activity patterns and reading times. So these models are not just useful for proposing the word “want” after you have typed “do you,” or for letting computers complete any number of tasks behind the scenes. They may also offer a window into how your brain makes sense of the flood of words coming out of your friend’s mouth.
The researchers contend that this study marks the first time that a machine-learning algorithm has been matched to brain data to explain the workings of a high-level cognitive task. (Neural networks have been used in visual and auditory research for years.) The finding suggests that predictive processing is central to how we comprehend language and demonstrates how artificial neural networks can offer key insights into cognition.
“I did not think this would happen in my lifetime,” says Evelina Fedorenko, a cognitive neuroscientist at M.I.T. and co-senior author of the paper. “These models fare much better than I would have predicted, relative to human neural data. That just opens up all sorts of doors.”
The researchers examined models based on 43 artificial neural networks, a technology built from thousands or millions of interconnected nodes, loosely analogous to neurons in the brain. Each node processes data and passes the result to other nodes. Some of the models the M.I.T. team looked at were optimized for next-word prediction, including the well-known Generative Pre-trained Transformer 2 (GPT-2), which has made an impression because of its ability to create humanlike text.
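To make next-word prediction concrete, here is a minimal sketch of how GPT-2 scores candidate next words, written against the Hugging Face transformers library (an assumption for illustration; the paper’s authors evaluated many models through their own pipeline). The prompt echoes the article’s “do you” example.

```python
# Minimal next-word-prediction sketch with GPT-2 via Hugging Face
# `transformers` (illustrative only; not the study's own pipeline).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("do you", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, sequence_length, vocab_size)

# The logits at the final position score every candidate next token.
top = torch.topk(logits[0, -1], k=5)
for score, token_id in zip(top.values, top.indices):
    print(repr(tokenizer.decode([int(token_id)])), f"{float(score):.2f}")
```

Each score is an unnormalized logit; applying a softmax to that final row turns it into a probability distribution over the model’s entire vocabulary, which is what “predicting the next word” means in practice.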
The researchers found that the activity of the neural network nodes tracked brain activity in humans reading text or listening to stories. They also used the networks’ internal representations to predict measures of human behavior, such as how long a person would take to read a particular word.
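In broad strokes, that comparison can be set up as an “encoding model”: a simple regression from a network’s internal activations to recorded brain responses, scored on held-out sentences. The sketch below uses random placeholder arrays in place of real activations and fMRI recordings, and scikit-learn’s ridge regression as one plausible choice of linear map; the paper’s exact preprocessing and scoring differ.

```python
# Hedged sketch of an encoding-model analysis: map network activations
# to brain responses with a linear model, then score generalization.
# The data here are random placeholders, not the study's recordings.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
activations = rng.standard_normal((200, 768))  # 200 sentences x 768 model features
brain = rng.standard_normal((200, 50))         # 200 sentences x 50 recording sites

X_train, X_test, y_train, y_test = train_test_split(
    activations, brain, test_size=0.2, random_state=0
)

# One regularized linear map from model features to every site at once.
encoder = RidgeCV(alphas=np.logspace(-3, 3, 7)).fit(X_train, y_train)
pred = encoder.predict(X_test)

# Pearson correlation between predicted and observed responses, averaged
# across sites, is one common summary of "brain predictivity."
r = [np.corrcoef(pred[:, i], y_test[:, i])[0, 1] for i in range(brain.shape[1])]
print("mean predictivity:", float(np.mean(r)))
```

With random inputs the score hovers near zero; the study’s finding is that real activations from models good at next-word prediction yield reliably higher scores against real recordings.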
The work lays a foundation for studying higher-level brain tasks. “We view this as a sort of a template or a guideline of how one can take this entire approach of relating models to data,” says Martin Schrimpf, a Ph.D. student in brain and cognitive sciences at M.I.T. and lead author of the paper.
The researchers found that the models that were best at guessing the next word were also best at predicting how a human brain would respond to the same language. This was especially true for single sentences and short paragraphs; the models were significantly worse at predicting words or human responses for longer blocks of text. None of the other training objectives the team examined tracked brain activity in this way. The authors argue this is strong evidence that next-word prediction, or something like it, plays a key role in understanding language. “It tells you that, basically, something like optimizing for predictive representation may be the shared objective for both biological systems and these in silico models,” Fedorenko says.
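The across-model comparison behind that claim is, at its core, a correlation: one number per model for next-word performance, one for brain predictivity. Here is a toy version with invented scores for five hypothetical models, not the study’s actual values.

```python
# Toy across-model analysis with invented numbers (NOT the study's data):
# does next-word performance track brain predictivity across models?
import numpy as np

next_word_score = np.array([0.21, 0.35, 0.42, 0.48, 0.55])    # hypothetical
brain_predictivity = np.array([0.10, 0.18, 0.25, 0.27, 0.33])  # hypothetical

r = np.corrcoef(next_word_score, brain_predictivity)[0, 1]
print(f"across-model correlation: r = {r:.2f}")
```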
To this point, Dan Yamins, a computational neuroscientist at Stanford University, who was not involved with the research, remarks, “A sort of convergent evolution has happened between the engineering and the real biological evolution.” Engineers, in other words, have independently arrived at a solution to a sophisticated cognitive problem that biological evolution worked out in the brain over vastly longer timescales.
“I’m super impressed by what [the M.I.T. team] achieved,” says Noah Goodman, a psychologist at Stanford, who was also not involved with the research. But he adds that he suspects that the data are not sufficient to explain how people derive meaning from language. Despite these reservations, Goodman says the method is “still vastly better than anything we’ve had in the past.”
Neural networks, and computational models more generally, are only rough analogues of the brain, yet they may prove to be powerful aids in understanding our own minds. The integrative-modeling approach used by Fedorenko and her colleagues suggests that such models could become critical tools for probing the great mystery of how the brain processes information of all kinds.