Scientists at the University of California, San Francisco have developed a device that decodes signals from the brain’s speech centres and produces speech through a synthesizer. The study, entitled “Speech synthesis from neural decoding of spoken sentences”, was published in the journal Nature.
Previous attempts to artificially translate brain activity into speech have mostly focused on understanding how speech sounds are represented in the brain, and have shown limited success. The current study, however, targeted the brain areas that send the instructions needed to coordinate the sequence of movements of the tongue, lips, jaw and throat during speech.
“We reasoned that if these speech centres in the brain are encoding movements rather than sounds, we should try to do the same in decoding those signals,” Gopala Anumanchipalli, a speech scientist at UCSF and the paper’s first author, told The Guardian.
The team enrolled five participants who were about to undergo neurosurgery for epilepsy. In preparation for the operation, doctors temporarily implanted electrodes in the brain to map the sources of the patients’ seizures. While the electrodes were in place, the patients were asked to read several hundred sentences aloud. At the same time, the scientists recorded activity from a brain area known to be involved in speech production.
The researchers aimed to decode the speech using a two-step process: first translating electrical signals in the brain into vocal-tract movements, and then translating those movements into speech sounds.
The scientists already had access to a large dataset connecting vocal movements to speech sounds, compiled in previous studies.
They then trained a machine-learning algorithm to match patterns of electrical activity in the brain with the vocal movements they would produce, such as pressing the lips together, tightening the vocal cords and shifting the tip of the tongue to the roof of the mouth. They describe the technology as a “virtual vocal tract” that can be controlled directly by the brain to produce a synthetic approximation of a person’s voice.
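The two-stage pipeline described above can be sketched in code. This is a minimal illustration only: the study itself used recurrent neural networks trained on electrocorticography recordings, whereas here a toy linear least-squares mapping on synthetic data stands in for each stage, and all dimensions, variable names and numbers are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative only; the real study used high-density
# cortical recordings and learned articulatory/acoustic features).
n_samples, n_electrodes, n_articulators, n_acoustic = 500, 16, 8, 4

# Stage-1 training data: neural activity paired with vocal-tract movements
# (lip closure, tongue position, etc.). Entirely synthetic here.
neural = rng.normal(size=(n_samples, n_electrodes))
true_A = rng.normal(size=(n_electrodes, n_articulators))
movements = neural @ true_A + 0.01 * rng.normal(size=(n_samples, n_articulators))

# Stage-2 training data: movements paired with acoustic features, standing
# in for the pre-existing movement-to-sound dataset the team drew on.
true_B = rng.normal(size=(n_articulators, n_acoustic))
acoustics = movements @ true_B + 0.01 * rng.normal(size=(n_samples, n_acoustic))

# Fit each stage separately with least squares (the paper used neural nets).
A_hat, *_ = np.linalg.lstsq(neural, movements, rcond=None)
B_hat, *_ = np.linalg.lstsq(movements, acoustics, rcond=None)

def decode(brain_activity):
    """Two-step decode: neural signals -> articulator movements -> acoustics."""
    predicted_movements = brain_activity @ A_hat  # stage 1: "virtual vocal tract"
    return predicted_movements @ B_hat            # stage 2: movements to sound

decoded = decode(neural)
err = np.abs(decoded - acoustics).mean()
print(f"mean absolute acoustic error: {err:.4f}")
```

The point of the intermediate articulatory stage, as the researchers argue, is that the cortical signals encode movements rather than sounds, so decoding through movements is an easier learning problem than mapping brain activity to audio directly.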
To test intelligibility, the scientists recruited listeners to identify and transcribe the speech synthesized from cortical activity. In one test they were given 100 sentences and, for each, a pool of 25 words to select from, including both target words and random ones. The Guardian reported that the listeners transcribed the sentences perfectly 43% of the time.
The study also reported that the decoder could synthesize speech when a participant silently mimed sentences.
These findings advance the clinical practicability of using speech neuroprosthetic technology to restore spoken communication.
“For the first time … we can generate entire spoken sentences based on an individual’s brain activity,” said Edward Chang, a professor of neurological surgery at the University of California San Francisco (UCSF) and the senior author of the work. “This is an exhilarating proof of principle that, with technology that is already within reach, we should be able to build a device that is clinically viable in patients with speech loss.”