Metaverse: how Meta wants to translate all the world's languages, including those that are not written

For the first time, an unwritten language can be translated automatically: on Wednesday, Meta's chief executive, Mark Zuckerberg, announced that Hokkien, a language widely spoken in China and an official language in Taiwan, but one with no widespread written form (unlike Mandarin, for example), could be translated by a speech translation system powered by artificial intelligence.

Little data to train AI on dialects and oral languages

“More than half of the world's 7,000-plus living languages are primarily oral and do not have a standard or widely used writing system: it is therefore impossible to create translation tools using standard techniques,” explains Meta. Indeed, to provide instant voice translation, most current systems go through a written transcription, and it is this transcription that is translated and then spoken aloud. All dialects and purely oral languages were therefore excluded from this kind of translation.

With this support for Hokkien, the company (which has been reorienting its business for the past year) is launching its own translation project, called Universal Speech Translator (UST). For this Chinese dialect, the challenge, according to Meta, was to collect enough data: “There are few resources to train artificial intelligence compared to languages like English or Spanish. In addition, there are relatively few (human) English-Hokkien translators, which complicates the compilation and annotation of data,” explains the firm.

Sound converted into waves, not just text

Meta researchers are therefore experimenting with ways to avoid going through a written transcription for translation. The process is called “speech-to-unit”, which could be rendered as “speech to phoneme”, as opposed to “speech-to-text”. It consists of “converting speech into a sequence of phonemes [corresponding to syllables, editor's note], then generating waveforms from these phonemes”. These waveforms, or sound spectra, correspond to words, and they are what gets translated.
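To give a rough idea of what turning speech into a sequence of units can look like, here is a toy sketch in Python. It is not Meta's system: the two-number "frames", the three-entry `CODEBOOK` and the function names are all hypothetical, chosen only to show the principle of mapping each short slice of sound to the closest known unit and then merging consecutive repeats.

```python
from itertools import groupby

# Hypothetical codebook: each unit id maps to a made-up 2-D reference point.
# A real system would learn such reference points from large amounts of audio.
CODEBOOK = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0)}


def nearest_unit(frame):
    """Return the id of the codebook entry closest to the frame
    (squared Euclidean distance)."""
    return min(
        CODEBOOK,
        key=lambda unit: sum((a - b) ** 2 for a, b in zip(frame, CODEBOOK[unit])),
    )


def speech_to_units(frames):
    """Map every frame to its nearest unit, then merge consecutive repeats."""
    units = [nearest_unit(frame) for frame in frames]
    return [unit for unit, _ in groupby(units)]


# Five invented "frames" standing in for short slices of a recording.
frames = [(0.1, 0.0), (0.0, 0.1), (0.9, 0.1), (1.1, -0.1), (0.1, 0.9)]
print(speech_to_units(frames))  # prints [0, 1, 2]
```

The result is a short sequence of unit ids rather than written words, which is why no writing system is needed at this stage.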

To refine the translation, the company explains, “speech-to-unit” is paired with “speech-to-text” in the closest known written language, which for Hokkien is Mandarin, since the two languages share words and turns of phrase. Concretely, when a user speaks a sentence in Hokkien, a first algorithm tries to transcribe it into written Mandarin to obtain a first translation, incomplete but based on well-established technology, and a second works directly from the syllables and their waveforms to produce the full translation.

A tool built for the metaverse

Finally, the researchers added tools for evaluating the resulting translation, using Tâi-lo, a Taiwanese spelling system. Starting from this first translation system between Hokkien and English, Meta's goal is to design a model that can be applied to all the world's languages, and especially those which, like a great many dialects, are exclusively oral.

This ambition is directly tied to the company's plans for the future, since it has bet everything on the development of the metaverse, a version of the Internet converted into a three-dimensional space in which users will move around as avatars and communicate with one another. The objective: to build such a translation tool into it so that everyone can communicate with others, just as is done today in writing on traditional websites.

