Functional magnetic resonance imaging, or fMRI, is one of the most advanced tools for understanding how we think. As a person in an fMRI scanner completes various mental tasks, the machine produces mesmerizing and colorful images of their brain in action.
Looking at someone’s brain activity this way can tell neuroscientists which brain areas a person is using but not what that individual is thinking, seeing or feeling. Researchers have been trying to crack that code for decades—and now, using artificial intelligence to crunch the numbers, they’ve been making serious progress. Two scientists in Japan recently combined fMRI data with advanced image-generating AI to translate study participants’ brain activity back into pictures that uncannily resembled the ones they viewed during the scans. The original and re-created images can be seen on the researchers’ website.
“We can use these kinds of techniques to build potential brain-machine interfaces,” says Yu Takagi, a neuroscientist at Osaka University in Japan and one of the study’s authors. Such interfaces could one day help people who currently cannot communicate, such as individuals who outwardly appear unresponsive but may still be conscious. The study was recently accepted to be presented at the 2023 Conference on Computer Vision and Pattern Recognition.
The study has made waves online since it was posted as a preprint (meaning it has not yet been peer-reviewed or published) in December 2022. Online commentators have even compared the technology to “mind reading.” But that description overstates what this technology is capable of, experts say.
“I don’t think we’re mind reading,” says Shailee Jain, a computational neuroscientist at the University of Texas at Austin, who was not involved in the new study. “I don’t think the technology is anywhere near to actually being useful for patients—or to being used for bad things—at the moment. But we are getting better, day by day.”
The new study is far from the first that has used AI on brain activity to reconstruct images viewed by people. In a 2019 experiment, researchers in Kyoto, Japan, used a type of machine learning called a deep neural network to reconstruct images from fMRI scans. The results looked more like abstract paintings than photographs, but human judges could still accurately match the AI-made images to the original pictures.
Neuroscientists have since continued this work with newer and better AI image generators. In the recent study, the researchers used Stable Diffusion, a so-called diffusion model from London-based start-up Stability AI. Diffusion models—a category that also includes image generators such as DALL-E 2—are “the main character of the AI explosion,” Takagi says. These models learn by adding noise to their training images. Like TV static, the noise distorts the images—but in predictable ways that the model begins to learn. Eventually the model can build images from the “static” alone.
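For readers who want a more concrete picture of that training trick, here is a toy Python sketch, using only NumPy and a much-simplified noise schedule (it is not the study’s code), of the “forward” step a diffusion model is trained to undo: blending a picture with ever-stronger static.

```python
# A toy sketch of the forward noising step that a diffusion model learns to reverse,
# using a much-simplified noise schedule (illustrative only; not the study's code).
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))        # stand-in for one training photo, pixel values in [0, 1]

def add_noise(x, noise_level):
    """Blend an image with Gaussian 'static': 0.0 leaves it clean, 1.0 is almost pure noise."""
    noise = rng.standard_normal(x.shape)
    return np.sqrt(1 - noise_level) * x + np.sqrt(noise_level) * noise

# The denoiser's training material: the same picture buried under heavier and heavier static.
# The model learns to predict the noise at each level, so it can later run the process in
# reverse and conjure an image out of static alone.
noisy_versions = [add_noise(image, t) for t in np.linspace(0.05, 0.95, 10)]
```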
Released to the public in August 2022, Stable Diffusion has been trained on billions of photographs and their captions. It has learned to recognize patterns in pictures, so it can mix and match visual features on command to generate entirely new images. “You just tell it, right, ‘A dog on a skateboard,’ and then it’ll generate a dog on a skateboard,” says Iris Groen, a neuroscientist at the University of Amsterdam, who was not involved in the new study. The researchers “just took that model, and then they said, ‘Okay, can we now link it up in a smart way to the brain scans?’”
The brain scans used in the new study come from a research database containing the results of an earlier study in which eight participants agreed to regularly lie in an fMRI scanner and view 10,000 images over the course of a year. The result was a huge repository of fMRI data that shows how the vision centers of the human brain (or at least the brains of these eight human participants) respond to seeing each of the images. In the recent study, the researchers used data from four of the original participants.
To generate the reconstructed images, the AI model needs to work with two different types of information: the lower-level visual properties of the image and its higher-level meaning. For example, it’s not just an angular, elongated object against a blue background—it’s an airplane in the sky. The brain also works with these two kinds of information and processes them in different regions. To link the brain scans and the AI together, the researchers used linear models to pair up the parts of each that deal with lower-level visual information. They also did the same with the parts that handle high-level conceptual information.
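That linking step can be sketched in a few lines of Python. The snippet below is a simplified stand-in rather than the authors’ pipeline: it uses ordinary ridge regression from scikit-learn to map made-up fMRI voxel patterns onto two placeholder feature spaces, one for low-level image latents and one for high-level semantic embeddings. The array sizes, feature dimensions, and regularization strength are all illustrative assumptions.

```python
# A simplified sketch of the linking step: one linear model per feature space,
# fit on placeholder data (illustrative assumptions only; not the authors' pipeline).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_images, n_voxels = 1000, 800                 # made-up sizes for one participant's training scans
fmri = rng.standard_normal((n_images, n_voxels))            # voxel responses to each viewed image
latent_features = rng.standard_normal((n_images, 1024))     # stand-in for low-level image latents
semantic_features = rng.standard_normal((n_images, 768))    # stand-in for high-level text embeddings

# One linear mapping per feature space, mirroring the low-level/high-level split described above.
low_level_decoder = Ridge(alpha=100.0).fit(fmri, latent_features)
high_level_decoder = Ridge(alpha=100.0).fit(fmri, semantic_features)
```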
“By basically mapping those to each other, they were able to generate these images,” Groen says. The AI model could then learn which subtle patterns in a person’s brain activation correspond to which features of the images. Once the model was able to recognize these patterns, the researchers fed it fMRI data that it had never seen before and tasked it with generating the image to go along with it. Finally, the researchers could compare the generated image to the original to see how well the model performed.
Many of the image pairs the authors showcase in the study look strikingly similar. “What I find exciting about it is that it works,” says Ambuj Singh, a computer scientist at the University of California, Santa Barbara, who was not involved in the study. Still, that doesn’t mean scientists have figured out exactly how the brain processes the visual world, Singh says. The Stable Diffusion model doesn’t necessarily process images in the same way the brain does, even if it’s capable of generating similar results. The authors hope that comparing these models and the brain can shed light on the inner workings of both complex systems.
As fantastical as this technology may sound, it has plenty of limitations. Each model has to be trained on, and use, the data of just one person. “Everybody’s brain is really different,” says Lynn Le, a computational neuroscientist at Radboud University in the Netherlands, who was not involved in the research. If you wanted to have AI reconstruct images from your brain scans, you would have to train a custom model—and for that, scientists would need troves of high-quality fMRI data from your brain. Unless you consent to lying perfectly still and concentrating on thousands of images inside a clanging, claustrophobic MRI tube, no existing AI model would have enough data to start decoding your brain activity.
Even with those data, AI models are only good at tasks for which they’ve been explicitly trained, Jain explains. A model trained on how you perceive images won’t work for trying to decode what concepts you’re thinking about—though some research teams, including Jain’s, are building other models for that.
It’s still unclear whether this technology would work to reconstruct images that participants have only imagined, not viewed with their eyes. That ability would be necessary for many applications of the technology, such as using brain-computer interfaces to help those who cannot speak or gesture to communicate with the world.
“There’s a lot to be gained, neuroscientifically, from building decoding technology,” Jain says. But the potential benefits come with potential ethical quandaries, and addressing them will become still more important as these techniques improve. The technology’s current limitations are “not a good enough excuse to take potential harms of decoding lightly,” she says. “I think the time to think about privacy and negative uses of this technology is now, even though we may not be at the stage where that could happen.”
This article is part of an ongoing series on generative AI in medicine.