Most aspects of life involve communicating with others and being understood by them. Many of us take this understanding for granted, but you can imagine the extreme difficulty and frustration you’d feel if people couldn’t easily understand the way you talk or express yourself. That’s the reality for millions of people living with speech impairments caused by neurologic conditions such as stroke, ALS, multiple sclerosis, traumatic brain injuries and Parkinson’s disease.
To help solve this problem, the Project Euphonia team, part of our AI for Social Good program, is using AI to improve computers’ abilities to understand diverse speech patterns, such as impaired speech. We’ve partnered with the non-profit organizations ALS Therapy Development Institute (ALS TDI) and ALS Residence Initiative (ALSRI) to record the voices of people who have ALS, a neurodegenerative condition that can result in the inability to speak and move. We collaborated closely with these groups to learn about the communication needs of people with ALS, and worked toward optimizing AI-based algorithms so that mobile phones and computers can more reliably transcribe words spoken by people with these kinds of speech difficulties. To learn more about how our partnership with ALS TDI started, read this article from Maeve McNally, Senior Director, Clinical Operations, and Fernando Vieira, ALS TDI Chief Scientific Officer.
Examples of phrases that we ask participants to read
To do this, Google software turns the recorded voice samples into a spectrogram, or a visual representation of the sound. The computer then uses common transcribed spectrograms to "train" the system to better recognize this less common type of speech. Our AI algorithms currently aim to accommodate individuals who speak English and have impairments typically associated with ALS, but we believe that our research can be applied to larger groups of people and to different speech impairments.
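As a rough illustration of the spectrogram step only, and not the software Project Euphonia actually uses, the sketch below slices a signal into short overlapping frames and measures how much energy each frame has at each frequency. Everything in it is an assumption made for the example: the synthetic 1 kHz tone standing in for a voice recording, the 8 kHz sample rate, the frame and hop sizes, and the hypothetical function names hann_window and spectrogram.

```python
import cmath
import math


def hann_window(size):
    # Tapers each frame toward zero at its edges to reduce spectral leakage.
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (size - 1)) for i in range(size)]


def spectrogram(samples, frame_size=64, hop=32):
    """Return one row per time frame; each row holds the magnitude of
    frequency bins 0..frame_size // 2 (a naive short-time Fourier transform)."""
    window = hann_window(frame_size)
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Direct DFT of one bin; slow, but fine for a tiny illustrative signal.
            total = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            magnitudes.append(abs(total))
        rows.append(magnitudes)
    return rows


# Assumed stand-in for recorded speech: a pure 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
samples = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]

spec = spectrogram(samples, frame_size=FRAME_SIZE, hop=32)

# The strongest bin in the first frame should sit at the tone's frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "frames x", len(spec[0]), "bins; peak near", peak_hz, "Hz")
```

Plotting the rows as columns of an image, with brightness for magnitude, gives the familiar spectrogram picture. A production system would more likely use a fast Fourier transform and a perceptual frequency scale, but the underlying idea of turning sound into a time-by-frequency grid is the same.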
In addition to improving speech recognition, we are also training personalized AI algorithms to detect sounds or gestures, and then take actions such as generating spoken commands to Google Home or sending text messages. This may be particularly helpful to people who are severely disabled and cannot speak.
The video below features Dimitri Kanevsky, a speech researcher at Google who learned English after he became deaf as a young child in Russia. Dimitri uses Live Transcribe with a customized model trained specifically to recognize his voice. The video also features collaborators who have ALS, such as Steve Saling, who was diagnosed with ALS 13 years ago. They use non-speech sounds to trigger smart home devices and facial gestures to cheer during a sports game.
We’re excited to see where this can take us, and we need your help. These improvements to speech recognition are only possible if we have many speech samples to train the system. If you have slurred or hard-to-understand speech, fill out this short form to volunteer and record a set of phrases. Anyone can also donate to or volunteer with our partners, ALS TDI and the ALS Residence Initiative. The more speech samples our system hears, the more potential we have to make progress and apply these tools to better support everyone, no matter how they communicate.