- August 28, 2023
- allix
- Research
Researchers at UC San Francisco and UC Berkeley have developed a brain-computer interface (BCI) that allows a woman who lost the ability to speak after a brainstem stroke to express herself through a digital avatar. It is the first time speech and facial expressions have been generated from brain signals. The system can also decode these signals into text at around 80 words per minute, a substantial improvement over existing commercial technologies.
Dr. Edward Chang, chair of neurological surgery at UCSF, has worked on this brain-computer interface technology for more than a decade. With the study published in Nature on August 23, 2023, the team hopes to pave the way for an FDA-approved system that enables speech synthesis from brain signals in the foreseeable future. Chang’s team had previously shown that brain signals could be decoded into text in a man who had also suffered a brainstem stroke several years earlier. The current study attempts something more ambitious: decoding brain signals into the nuances of speech, including the facial movements that accompany conversation. Chang implanted a thin array of 253 electrodes on regions of the woman’s brain that are critical for speech. The electrodes intercepted signals that, if not for the stroke, would have controlled the muscles in her tongue, jaw, larynx, and face. A cable plugged into a port fixed to her head connected the electrodes to a bank of computers.
Over several weeks, the participant worked with the research team to train the system’s artificial intelligence algorithms to recognize her unique brain signals for speech. This involved repeatedly attempting to say different phrases from a 1,024-word vocabulary until the computer could identify the brain activity patterns associated with the sounds. Rather than training the AI to recognize whole words, the scientists designed a system that decodes words from phonemes, the elemental units of speech that form spoken words in the same way that letters form written words. For example, the word “Hello” is composed of four phonemes: “HH,” “AH,” “L,” and “OW.”
With this approach, the computer needed to learn only 39 phonemes to decode any English word, which both improved the system’s accuracy and made it three times faster. “The precision, swiftness, and lexicon are of paramount importance,” said Sean Metzger, who developed the text decoder with Alex Silva. Both are graduate students in the joint Bioengineering Program at UC Berkeley and UCSF. “This lays the foundation for users to eventually communicate almost as rapidly as natural conversations while maintaining a more authentic and ordinary dialogue.”
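As a rough illustration of why a phoneme-level decoder scales to a large vocabulary, the sketch below maps words to phoneme sequences and then segments a flat phoneme stream back into words. It is not the researchers' decoder: the four-word lexicon, the function names, and the simple backtracking search are all assumptions made for illustration, and a real system would decode noisy phoneme probabilities rather than a clean sequence.

```python
from functools import lru_cache

# Hypothetical mini-lexicon with ARPAbet-style phoneme symbols.
# A real lexicon would cover the full vocabulary using about 39 phonemes.
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "how": ["HH", "AW"],
    "are": ["AA", "R"],
    "you": ["Y", "UW"],
}


def words_to_phonemes(words, lexicon=LEXICON):
    """Flatten a list of words into one phoneme sequence."""
    phonemes = []
    for word in words:
        phonemes.extend(lexicon[word.lower()])
    return phonemes


def phonemes_to_words(phonemes, lexicon=LEXICON):
    """Segment a phoneme sequence into lexicon words.

    Returns the first segmentation found (shortest word first),
    or None if the sequence cannot be covered by lexicon entries.
    """
    phonemes = tuple(phonemes)
    by_pronunciation = {tuple(p): w for w, p in lexicon.items()}

    @lru_cache(maxsize=None)
    def segment(start):
        if start == len(phonemes):
            return ()
        for end in range(start + 1, len(phonemes) + 1):
            word = by_pronunciation.get(phonemes[start:end])
            if word is not None:
                rest = segment(end)
                if rest is not None:
                    return (word,) + rest
        return None

    result = segment(0)
    return list(result) if result is not None else None


if __name__ == "__main__":
    stream = words_to_phonemes(["hello", "how", "are", "you"])
    print(stream)
    print(phonemes_to_words(stream))
```

Adding a word to this sketch only requires a new lexicon entry; the set of phoneme symbols the decoder must recognize stays the same, which is the property the paragraph above describes.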
To generate the voice, the team devised a speech synthesis algorithm and personalized it to sound like the woman’s pre-injury voice, using a recording of her speaking at her wedding. The avatar was animated with software that simulates and animates facial muscle movements, provided by Speech Graphics, an AI-driven facial animation company. The researchers developed tailored machine learning processes to connect the company’s software to the brain signals sent while the woman tried to speak. These processes converted the signals into movements of the avatar’s face, such as opening and closing the jaw, shaping the lips, and moving the tongue. The avatar also displayed facial expressions such as happiness, sadness, and surprise.
“We’re compensating for the severed connections between the brain and vocal apparatus due to the stroke,” explained Kaylo Littlejohn, a graduate student working with Chang and Dr. Gopala Anumanchipalli, a professor of electrical engineering and computer sciences at UC Berkeley. “The instant the subject employed this system to articulate and synchronize the avatar’s facial motions, I sensed its potential to make a substantial impact.” An important next step for the team is to create a wireless version of the system, so that users no longer need to be physically connected to the BCI. “Granting individuals the freedom to operate their computers and phones using this technology could profoundly enhance their self-reliance and social interactions,” said co-first author Dr. David Moses, an adjunct professor in neurological surgery.