
Natural-Sounding Speech Produced from Brain Signals

Gopala Anumanchipalli holds an electrode implant used to record brain signals in the studies. (University of California, San Francisco)

25 Apr. 2019. A neuroscience lab created a virtual speech-generating vocal system that produces natural-sounding human speech by interpreting activity in the brain. A team from the University of California, San Francisco describes the technology in yesterday’s issue of the journal Nature (paid subscription required).

Researchers from the lab of neuroscientist and neurosurgery professor Edward Chang aim to develop systems for people with severe neurological conditions that prevent or impair the ability to speak, such as stroke, traumatic brain injury, Parkinson’s disease, and amyotrophic lateral sclerosis, or ALS. For this task, the team led by postdoctoral researcher and computer scientist Gopala Anumanchipalli needed to recreate the acoustic properties of human speech, capture the brain activity that creates the desired messages, and interpret those brain signals to produce natural-sounding speech.

In a paper published last year, Chang and colleagues explained a key piece of the puzzle, detailing mechanisms in the brain that instruct vocal tract organs, such as the mouth and tongue, to produce speech. The researchers call these mechanisms articulatory kinematic trajectories, which produce not only abstract sounds, but also give expression to the voice.

For this earlier study, Anumanchipalli and doctoral candidate Josh Chartier recruited 5 individuals with epilepsy who had temporary electrode implants placed on their brains to track brain activity before surgery to remove seizure-producing brain cells. The team asked the participants to read aloud hundreds of sentences, during which the researchers recorded their voices, then mapped their brain signals to vocal tract movements with a deep-learning algorithm.
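For readers curious how such a mapping might look in code, the following is a minimal sketch, not the authors’ implementation: a small recurrent neural network (a bidirectional LSTM in PyTorch) trained to translate sequences of neural recordings into vocal tract movement trajectories. All sizes, such as the number of electrodes and kinematic features, are illustrative placeholders.

```python
# Sketch only: a decoder mapping neural feature sequences to
# articulatory kinematic trajectories. All sizes are placeholders.
import torch
import torch.nn as nn

class ArticulatoryDecoder(nn.Module):
    def __init__(self, n_electrodes=256, n_kinematics=33, hidden=100):
        super().__init__()
        # Bidirectional LSTM reads the neural signal over time.
        self.rnn = nn.LSTM(n_electrodes, hidden, num_layers=2,
                           batch_first=True, bidirectional=True)
        # Linear readout to vocal tract movement features per time step.
        self.out = nn.Linear(2 * hidden, n_kinematics)

    def forward(self, ecog):               # ecog: (batch, time, electrodes)
        h, _ = self.rnn(ecog)
        return self.out(h)                 # (batch, time, kinematics)

# Training pairs brain recordings with speech-derived kinematics,
# minimizing mean-squared error on the trajectories.
model = ArticulatoryDecoder()
ecog = torch.randn(8, 200, 256)            # 8 trials, 200 time steps
kinematics = torch.randn(8, 200, 33)       # matched movement trajectories
loss = nn.MSELoss()(model(ecog), kinematics)
loss.backward()
```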

“The relationship between the movements of the vocal tract and the speech sounds that are produced is a complicated one,” says Anumanchipalli in a university statement. “We reasoned that if these speech centers in the brain are encoding movements rather than sounds, we should try to do the same in decoding those signals.”

In the new study, Chang, Anumanchipalli, and Chartier asked another 5 individuals with epilepsy, about to undergo the same kind of surgery, to again read aloud hundreds of sentences while wearing implants that record brain signals. This time, the team not only decoded the articulatory kinematic trajectories underlying participants’ speech, but also used a second machine-learning algorithm to convert these instructions for vocal organs into synthesized speech.
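That second stage could be sketched the same way. The hypothetical example below, again only an illustration and not the published model, converts the decoded kinematic trajectories into frame-by-frame acoustic parameters, which an off-the-shelf vocoder would then render as audible speech.

```python
# Sketch only: a second network turns decoded vocal tract movements
# into acoustic features that a vocoder would render as audio.
import torch
import torch.nn as nn

class AcousticDecoder(nn.Module):
    def __init__(self, n_kinematics=33, n_acoustics=32, hidden=100):
        super().__init__()
        self.rnn = nn.LSTM(n_kinematics, hidden, num_layers=2,
                           batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_acoustics)

    def forward(self, kin):                # kin: (batch, time, kinematics)
        h, _ = self.rnn(kin)
        return self.out(h)                 # acoustic parameters per frame

# Full two-stage pipeline: brain signals -> kinematics -> acoustics;
# the acoustic frames would then feed a speech synthesizer.
stage2 = AcousticDecoder()
kinematics = torch.randn(1, 200, 33)       # output of the first stage
acoustics = stage2(kinematics)             # input to a vocoder
```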

The researchers asked reviewers, recruited through Amazon’s Mechanical Turk crowdsourcing service, to identify the correct words from lists of alternatives, or to transcribe the sentences produced by the speech synthesizer. When choosing from lists of 25 alternatives, reviewers accurately selected about 7 in 10 words (69%) and correctly transcribed more than 4 in 10 sentences (43%). With longer lists of 50 alternatives, accurate word identifications dropped to 47 percent, and accurate transcriptions declined to about 2 in 10 sentences (21%).
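The two intelligibility measures described above can be expressed in a few lines of Python: closed-set word identification accuracy, and a word error rate for transcribed sentences based on edit distance. The sample inputs are made-up placeholders, not data from the study.

```python
# Sketch only: scoring a listening test with closed-set word
# identification accuracy and word error rate. Placeholder data.

def identification_accuracy(choices, truths):
    """Fraction of trials where the listener picked the spoken word."""
    return sum(c == t for c, t in zip(choices, truths)) / len(truths)

def word_error_rate(reference, hypothesis):
    """Word-level edit distance, normalized by reference length."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / len(r)

print(identification_accuracy(["mouse", "cat"], ["mouse", "dog"]))  # 0.5
print(word_error_rate("the cat sat", "the cat sits"))               # 0.333...
```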

The team acknowledges their synthesized speech does not perfectly emulate natural speech, but says the work advances the technology beyond what’s currently available. “For the first time, this study demonstrates that we can generate entire spoken sentences based on an individual’s brain activity,” says Chang, adding, “with technology that is already within reach, we should be able to build a device that is clinically viable in patients with speech loss.”

The following video gives examples of the same sentences spoken by humans and produced with synthesized speech.

*     *     *
