Researchers from The Zuckerman Institute developed a deep neural network model that directly estimates the parameters of a speech synthesizer from all neural frequencies
A team of researchers from The Zuckerman Institute at Columbia University developed a system that translates thought into intelligible and recognizable speech. The new technology can monitor a person’s brain activity and reconstruct the words the person hears with unprecedented clarity. It relies on speech synthesizers and Artificial Intelligence (AI) and could lead to new ways for computers to communicate directly with the brain. It could also help people who have amyotrophic lateral sclerosis (ALS) or are recovering from a stroke. The research was published in Scientific Reports on January 29, 2019.
The team used a vocoder, a computer algorithm that can synthesize speech after being trained on recordings of people talking. To teach the vocoder to interpret brain activity, the team asked epilepsy patients already undergoing brain surgery to listen to sentences spoken by different people. The team measured patterns of brain activity while the patients listened and used those neural patterns to train the vocoder. In the next stage of the research, the team asked the same patients to listen to speakers who recited digits between 0 and 9. The team recorded the resulting brain signals and ran them through the vocoder.
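The train-then-test procedure described above can be illustrated with a toy sketch. This is not the researchers' method: the study used a deep neural network on real recordings, while the code below substitutes a simple ridge regression on synthetic data. The names `fit_ridge` and `make_synthetic`, the electrode and parameter counts, and the noise level are all assumptions chosen for illustration only.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha*I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def make_synthetic(n_frames, true_map, rng, noise=0.1):
    """Fake 'neural' features and the 'vocoder parameters' they encode (hypothetical)."""
    n_electrodes, n_params = true_map.shape
    X = rng.standard_normal((n_frames, n_electrodes))
    Y = X @ true_map + noise * rng.standard_normal((n_frames, n_params))
    return X, Y

rng = np.random.default_rng(0)
n_electrodes, n_params = 16, 4          # assumed sizes, not from the study
true_map = rng.standard_normal((n_electrodes, n_params))

# Stage 1: "patients listen to sentences" -> training data
X_train, Y_train = make_synthetic(500, true_map, rng)
# Stage 2: "patients listen to digits" -> held-out data
X_test, Y_test = make_synthetic(100, true_map, rng)

W = fit_ridge(X_train, Y_train)         # learn neural -> vocoder-parameter map
Y_pred = X_test @ W                     # predicted synthesizer parameters

mse = float(np.mean((Y_pred - Y_test) ** 2))
# Baseline: always predict the mean training parameters
baseline_mse = float(np.mean((Y_test - Y_train.mean(axis=0)) ** 2))
print(f"held-out MSE: {mse:.4f}  baseline MSE: {baseline_mse:.4f}")
```

In a real system the predicted parameters would be passed to a speech synthesizer to produce audio; here the sketch only checks that the learned mapping beats a trivial baseline on data it was not trained on.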
The team used neural networks to analyze and clean up the sound produced by the vocoder in response to those signals. The approach yielded a robotic-sounding voice reciting a sequence of numbers. To test the accuracy of the recording, the team then asked listeners to report what they heard. The team found that people were able to understand and repeat the sounds around 75% of the time. Moreover, the improvement in intelligibility was especially evident when the team compared the new recordings to the earlier, spectrogram-based attempts. The team plans to test more complicated words and sentences and to run the same tests on brain signals emitted when a person speaks or imagines speaking.