Overview
- Researchers report in Science Advances a proof-of-concept method that decodes fMRI brain activity into descriptive text, tested on six participants.
- Participants watched 2,196 short videos during scanning, then later recalled them in a separate memory task under the same imaging setup.
- Text generated while participants viewed the videos matched the correct caption at about 50% accuracy in candidate-identification tests, and the best-performing individuals reached nearly 40% when identifying recalled videos from 100 options.
- The pipeline decodes semantic features derived from video captions (extracted with DeBERTa-large) from brain activity, then iteratively refines a candidate description with a masked language model (RoBERTa-large) until the text's features match the decoded ones (a minimal sketch of this refinement loop follows the list).
- Authors highlight potential communication uses while stressing constraints such as coarse fMRI signals, limited sample size, variable individual performance, and mental-privacy and consent requirements.
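The bullets above describe a decode-then-refine loop: predict caption-like semantic features from brain activity, then edit a text candidate until its features align with those predictions. The sketch below illustrates that idea with off-the-shelf Hugging Face models; the checkpoints (`microsoft/deberta-large`, `roberta-large`), the mean-pooled feature, the cosine objective, and the greedy word-by-word search are assumptions for illustration, not the authors' released implementation.

```python
# Illustrative sketch of a decode-then-refine caption loop (NOT the paper's code).
import torch
from transformers import AutoTokenizer, AutoModel, AutoModelForMaskedLM

FEAT_MODEL = "microsoft/deberta-large"  # feature extractor (assumed checkpoint)
MLM_MODEL = "roberta-large"             # masked LM proposing word edits (assumed)

feat_tok = AutoTokenizer.from_pretrained(FEAT_MODEL)
feat_model = AutoModel.from_pretrained(FEAT_MODEL).eval()
mlm_tok = AutoTokenizer.from_pretrained(MLM_MODEL)
mlm = AutoModelForMaskedLM.from_pretrained(MLM_MODEL).eval()

@torch.no_grad()
def caption_features(text: str) -> torch.Tensor:
    """Mean-pooled hidden state as a stand-in for the caption's semantic features."""
    enc = feat_tok(text, return_tensors="pt")
    return feat_model(**enc).last_hidden_state.mean(dim=1).squeeze(0)

@torch.no_grad()
def propose_edits(text: str, position: int, top_k: int = 5) -> list[str]:
    """Mask one word and let the masked LM suggest replacements."""
    words = text.split()
    masked = words.copy()
    masked[position] = mlm_tok.mask_token
    enc = mlm_tok(" ".join(masked), return_tensors="pt")
    logits = mlm(**enc).logits
    mask_idx = (enc.input_ids == mlm_tok.mask_token_id).nonzero()[0, 1]
    candidates = logits[0, mask_idx].topk(top_k).indices
    out = []
    for tok_id in candidates:
        new_words = words.copy()
        new_words[position] = mlm_tok.decode(int(tok_id)).strip()
        out.append(" ".join(new_words))
    return out

def refine(decoded_feat: torch.Tensor, seed: str, n_iters: int = 3) -> str:
    """Greedy hill-climbing: keep edits that move the caption's features
    closer (cosine similarity) to the features decoded from brain activity."""
    best = seed
    best_score = torch.cosine_similarity(caption_features(best), decoded_feat, dim=0).item()
    for _ in range(n_iters):
        improved = False
        for pos in range(len(best.split())):
            for cand in propose_edits(best, pos):
                score = torch.cosine_similarity(
                    caption_features(cand), decoded_feat, dim=0).item()
                if score > best_score:
                    best, best_score, improved = cand, score, True
        if not improved:
            break
    return best

if __name__ == "__main__":
    # Stand-in for decoded features: in the study these would come from fMRI
    # decoders trained on caption features; here a target caption's own
    # features substitute so the loop can run end to end.
    target = caption_features("a person is riding a bicycle down a hill")
    print(refine(target, seed="someone is doing something outside"))
```

In the reported pipeline the decoded features would be predicted from brain activity rather than taken from a known caption, and the search over candidate texts is presumably more elaborate than this greedy word-swap loop; the sketch only conveys the general refine-until-features-match idea.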