A translator in your ear
The Consumer Electronics Show (CES) 2018 closed its doors in Las Vegas a few weeks ago. Over 3,900 exhibitors displayed global trends in innovation at the largest technology fair of the year. The development of artificial intelligence in the field of languages and wearable technology particularly caught our attention. One of the latest developments is about to become a phenomenon: smart headsets. These are wireless headsets that translate what the person you are talking to is saying in another language almost instantly (in one or two seconds), generally via the Internet. They reflect a globalised world with ever-growing communications and mobility, in which the field of interpreting must respond to a growing demand. But does it not all sound a bit too good to be true? A brief look into the world of these smart headsets has shown us their incredible achievements and the long road that they still have ahead of them...
Promotional video “Mars, truly innovative Wireless earbuds”
We must admit that before we started looking into these headsets, we assumed that they would be much the same as a machine translator, since they face the same problems. A machine translator, while useful, can produce many errors. A device like this is even harder to build, however, because of the number of technologies it combines: on top of a reliable translation tool, it needs a voice recognition system, voice synthesis, and a small, attractive and ergonomic design.
The concept is as follows: a headset (whether wireless or not) that simultaneously and automatically translates a conversation held in another language. In some ways, it is a personal interpreter. Two people wearing the headset can therefore have a conversation in their respective languages with no language barrier. To achieve this, certain headsets such as Pilot require the user to download a mobile application that mediates between the headset and the machine translation tool.
The languages available to translate to or from depend on the model. LINE Mars, presented at the CES in January, supports ten languages. Pilot Translation Earpieces, by Waverly Labs, understand English, French, Italian, Portuguese, Spanish, Arabic, Mandarin Chinese, German, Greek, Hindi, Japanese, Korean, Polish, Russian and Turkish. As for Google Pixel Buds, they translate to and from over 40 languages.
Promotional video “Pilot Translation Earpieces, Waverly Labs”
According to Nicholas Ruiz, researcher in the field of translation at Waverly Labs, language recognition systems can recognise 90% of what people are saying. But the researcher warns that during sensitive moments, when the translation requires a higher degree of precision, machines are not reliable. That leads us to one fundamental question: what translation procedure do these headsets use?
Translation is carried out in three stages: voice recognition, machine translation and voice synthesis. Automatic voice recognition transcribes the speech captured by the microphone into words. These words are then translated by a machine that operates according to techniques and algorithms developed in the field of deep learning, just like the recent programme DeepL. The translated words are then converted into sound by a voice synthesizer which seeks to imitate the natural melodic curve of speakers. In the case of LINE Corporation’s Mars earbuds, the translation is performed by Papago, a translator from the company Naver and an alternative to Google Translate that also uses predictive algorithms.
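The three stages described above can be sketched as a simple chain of functions. This is only a toy illustration: the function names, the tiny dictionary and the idea of treating "audio" as plain text bytes are all assumptions made for the example, and none of it reflects how Pilot, Mars or Pixel Buds are actually implemented.

```python
# Toy sketch of the three-stage chain: recognition -> translation -> synthesis.
# Every name here is hypothetical; real headsets use trained models, not lookups.

# Assumed miniature English-to-Spanish vocabulary, for illustration only.
TOY_DICTIONARY = {"hello": "hola", "friend": "amigo"}


def recognise_speech(audio: bytes) -> str:
    """Stage 1: turn captured 'audio' into words (here we just decode text bytes)."""
    return audio.decode("utf-8").strip().lower()


def translate_text(text: str) -> str:
    """Stage 2: translate word by word, leaving unknown words unchanged."""
    return " ".join(TOY_DICTIONARY.get(word, word) for word in text.split())


def synthesise_voice(text: str) -> bytes:
    """Stage 3: turn translated words back into 'audio' (here, text bytes again)."""
    return text.encode("utf-8")


def translate_utterance(audio: bytes) -> bytes:
    """Run the three stages in order, as described in the article."""
    words = recognise_speech(audio)
    translated = translate_text(words)
    return synthesise_voice(translated)


if __name__ == "__main__":
    print(translate_utterance(b"Hello friend"))
```

Because each stage feeds the next, an error made early on (for example, a word misheard because of background noise) is carried through to the translation and to the synthesised voice.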
Google Pixel Buds use the database stored in Google Translate. It is increasingly reliable because it is refined on a daily basis with new content uploaded to the web: books, documents belonging to international organisations and websites. Google compiles millions of texts translated by human translators, and the system aligns the texts and works out how everything is translated by establishing patterns. It is a matter of pure statistics, but with daily experience of one billion translations, it works “more or less” correctly. This tool was designed without the help of any linguists, relying only upon machines. In order to simplify the process and make it more cost-effective, each language is first translated into English before being translated into the other languages, hence the many grammatical mistakes between, for example, the different Romance languages.
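To see why routing every translation through English can cause mistakes between Romance languages, consider the following minimal sketch. The lookup tables and the function name `pivot_translate` are invented for this example and are not taken from any real translation service; the point is only that a distinction English does not make (here, informal versus formal "you") is lost at the intermediate step.

```python
# Toy sketch of pivot translation through English.
# The tables below are hypothetical and tiny; they only illustrate information loss.

# Source word -> English. Spanish informal "tú" and formal "usted" both become "you".
TO_ENGLISH = {
    ("es", "tú"): "you",
    ("es", "usted"): "you",
}

# English -> target word. From "you" alone, only one French form can be chosen.
FROM_ENGLISH = {
    ("fr", "you"): "vous",
}


def pivot_translate(word: str, source: str, target: str) -> str:
    """Translate a single word from source to target via English."""
    english = word if source == "en" else TO_ENGLISH[(source, word)]
    return english if target == "en" else FROM_ENGLISH[(target, english)]


if __name__ == "__main__":
    # Both Spanish forms come out identical in French: the register is lost.
    print(pivot_translate("tú", "es", "fr"))
    print(pivot_translate("usted", "es", "fr"))
```

A direct Spanish-to-French table could map "tú" to "tu" and "usted" to "vous"; once both have passed through the English "you", that choice can no longer be recovered.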
Promotional Image Google Pixel Buds
To design these devices, the first models of which went on sale in 2017, the creators have taken advantage of advances in language recognition and machine translation, but they have also faced several challenges.
The first difficulty they have had to address is recording sentences clearly, avoiding the background noise that confuses language recognition and leads to an incorrect translation. Most of these headsets include noise-reducing microphones, but even so they often pick up background noise and translate it along with whatever the speaker is saying.
Another problem for users is that the headsets need a permanent WiFi connection; otherwise the translation app cannot do its job. In a foreign country, however, which is where users need translation most, a direct and permanent WiFi connection is harder to come by than a 3G or 4G connection.
In certain cases, such as that of Pilot Translation Earpieces, each user has to download the app to use the earpiece. This is not very realistic in a foreign country, where you would have to pass your earpiece from one person to another (leaving the question of hygiene aside) and ask each of them to download the app on their mobile in order to converse with you.
Furthermore, there is the problem of the quality of the translation. Machine translation software has made fantastic progress in recent years, but it still cannot handle certain types of content. Cultural references, idioms, regionalisms, humour, jokes and set expressions are all extremely difficult to render. Likewise, a machine translator cannot provide a valid translation for poetry or a novel in which every word is weighed and the melody is as important as the sense. Intonation is a fundamental part of speech and it is still not possible to imitate it with a machine.
Providing a reliable, high-quality linguistic service for travelling or conversing with speakers of other languages is the key to the success of these products. “Don’t bite off more than you can chew” tends to be the golden rule, even for human translators. In this respect, there is another wearable translator, ili, which has a more limited function. This tiny object, which can be worn around the neck or held in the hand, translates without an Internet connection and contains the most common phrases for travelling. A linguistic laboratory is responsible for developing the translation database, ensuring a reliable translation. The most interesting aspect of this object is that it is perfectly clear about its own limits: questions must be simple, sentences must be short, and colloquial or technical words should be avoided.
The idea of replacing a human interpreter with wearable technology is far from becoming a reality. The headsets need to improve a great deal before they can provide a convincing and effective experience to users, but the first steps have already been taken.