
Scientist turns people’s mental images into text using ‘mind-captioning’ technology

Tomoyasu Horikawa via CNN Newsource
Functional magnetic resonance imaging (fMRI) is a non-invasive technique that measures brain activity. This fMRI image shows multiple horizontal views of the brain.

By Amarachi Orie, CNN

(CNN) — A scientist in Japan has developed a technique that uses brain scans and artificial intelligence to turn a person’s mental images into accurate, descriptive sentences.

While there has been progress in using scans of brain activity to translate the words we think into text, turning our complex mental images into language has proved challenging, according to Tomoyasu Horikawa, author of a study published November 5 in the journal Science Advances.

However, Horikawa’s new method, known as “mind-captioning,” works by using AI to generate descriptive text that mirrors information in the brain about visual details such as objects, places, actions and events, as well as the relationships between them.

Horikawa, a researcher at telecommunication company NTT’s Communication Science Laboratories just outside Tokyo, began by analyzing the brain activity of four men and two women, native Japanese speakers between 22 and 37 years old, scanning their brains as they watched video clips. The participants viewed 2,180 silent video clips, each a few seconds long, depicting a variety of objects, scenes and actions.

Large language models — generative AI systems trained on large datasets — took captions of the video clips and turned those captions into sequences of numbers.

Horikawa trained separate, simpler AI models, known as “decoders,” to match the scanned brain activity related to the video clips to the numerical sequences.

He then used the decoders to interpret the study participants’ brain activity while they watched or recalled videos that the AI had not encountered during the training process. Another algorithm was created to progressively generate word sequences that best matched the decoded brain activity.
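The study’s actual pipeline relies on real fMRI recordings and large language models. The toy Python sketch below only illustrates the general idea: captions are turned into numerical vectors, a simple decoder learns to map (here, synthetic) brain activity onto those vectors, and a decoded vector is then matched against candidate sentences. The word-count embedder stands in for an LLM, the candidate search stands in for the study’s progressive sentence generation, and every name and dimension is illustrative rather than taken from the paper.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

EMBED_DIM = 16   # size of the caption-embedding space (toy value)
VOXELS = 200     # number of fMRI features per scan (toy value)

def embed_caption(caption: str) -> np.ndarray:
    """Stand-in for an LLM: map a caption to a fixed-length unit vector."""
    vec = np.zeros(EMBED_DIM)
    for word in caption.lower().split():
        vec[sum(ord(ch) for ch in word) % EMBED_DIM] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

# 1) Captions of the training videos become sequences of numbers.
train_captions = ["a dog runs on the beach", "a man rides a bicycle",
                  "children play in a park", "a car drives down a street"]
targets = np.stack([embed_caption(c) for c in train_captions])

# 2) Simulated brain activity recorded while each video was watched.
brain_activity = rng.normal(size=(len(train_captions), VOXELS))

# 3) Train a simple decoder mapping brain activity -> caption embeddings.
decoder = Ridge(alpha=1.0).fit(brain_activity, targets)

# 4) For a new (slightly noisy) scan, decode an embedding, then pick the
#    candidate sentence whose embedding best matches it. The study instead
#    builds sentences word by word; selection here is a simplification.
new_scan = brain_activity[0] + 0.05 * rng.normal(size=VOXELS)
decoded = decoder.predict(new_scan[None, :])[0]

candidates = train_captions + ["a cat sleeps on a sofa"]
best = max(candidates, key=lambda c: float(embed_caption(c) @ decoded))
print(best)
```

Ridge regression is a common choice for fMRI decoders because there are far more voxels than training examples; the regularization keeps the mapping from overfitting to individual scans.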

As the AI learned from the data, the descriptive text tool became increasingly accurate at describing, from the brain scans alone, the videos the participants had watched.

“It’s just one additional step forward in the direction of what, in my view, we can legitimately call brain-reading or mind-reading,” Marcello Ienca, a professor of the ethics of AI and neuroscience at Technical University of Munich in Germany and president-elect of the International Neuroethics Society, told CNN. He was not involved in the study.

Potential for ‘profound’ health interventions

The AI model generated text in English, even though the participants were not native English speakers.

The method can create comprehensive descriptions of visual content, even without using the activity in language-related regions of the brain, or the “language network,” Horikawa said, “indicating that this method can be used even when someone has damage around that language network.”

The technology could potentially be used to assist people with aphasia, who struggle with language expression due to damage around the language network; or amyotrophic lateral sclerosis (ALS), a progressive neurodegenerative disease that affects speech, according to the study.

“I think this study paves the way for some profound interventions for people who have difficulty communicating, including non-verbal autistic people,” said psychologist Scott Barry Kaufman, a lecturer at Barnard College in New York who was not involved in the study.

However, “we have to use it carefully and make sure we aren’t being invasive and that everyone consents to it,” he told CNN.

‘The ultimate privacy challenge’

The success of this method — which could be applied to decode the thoughts of infants or animals, or the content of dreams — “raises ethical concerns” regarding privacy, with the possibility of disclosing an individual’s private thoughts before they have verbalized them, the study noted.

If in the future this technology is used by consumers beyond biomedical purposes, “I think this is the ultimate privacy challenge,” Ienca said.

He added that many companies, such as Neuralink, Elon Musk’s brain implant startup, are publicly claiming that they will soon develop neural implants for the general population.

“If we get there, then we need to have very, very strict rules when it comes to granting access to people’s minds and brains,” Ienca said, highlighting that our brains include “sensitive information” such as “signatures of early dementia and psychiatric disorders and depression.”

A study published in the journal Cell in August suggested that the “leakage” of private inner thoughts during decoding could be prevented by a mechanism in which the user thinks of a particular keyword to unlock the decoding tool only when intended.

The “neuroscience is moving fast and the assistive potential is huge — but mental privacy and freedom of thought protections can’t wait,” said social scientist Łukasz Szoszkiewicz, an assistant professor at Adam Mickiewicz University in Poland and a director of European Affairs at the Neurorights Foundation in New York.

“We should treat neural data as sensitive by default, require explicit purpose-limited consent, and prioritize on-device processing with user-controlled ‘unlock’ mechanisms. Reliance on AI introduces additional regulatory and cybersecurity challenges and underscores the need for complementary, AI-specific legal framework,” Szoszkiewicz, who was not involved in the study, told CNN.

However, Horikawa noted that the method used in his study requires a large amount of data collection, with the cooperation of active participants. So, while the technology is useful for neuroscientific research, it is “not so accurate for practical use,” he said.

Also, the videos used in the study included typical scenes of, for example, a dog biting a man, but not more unusual scenes — say, a man biting a dog. Therefore, it is not yet clear whether the technique could be used to capture less predictable mental images.

As a result, “while some people may worry that this technology poses serious risk to mental privacy,” in reality, “the current approach cannot easily read a person’s private thoughts,” Horikawa said.

The-CNN-Wire
™ & © 2025 Cable News Network, Inc., a Warner Bros. Discovery Company. All rights reserved.
