Meta develops AI translator for a primarily spoken language
This week, Meta said it has developed an artificial intelligence (AI)-powered speech-to-speech translation system that can translate between English and predominantly oral languages that have no widely used writing system. The system will initially translate from and into Hokkien, a language spoken in Taiwan and elsewhere that lacks a standard written form.
Meta CEO Mark Zuckerberg posted a video on Wednesday showing the technology in action. In the video, he appears with software developer Peng-Jen Chen, and Meta’s AI system translates their conversation from Hokkien into spoken English. The demo looks impressive, but the video may have been edited for demonstration purposes, so the final product may not be as polished as what was shown.
The system is the result of extensive research by Meta AI teams around the world, particularly in Israel, where Meta has established a sizable R&D operation, the largest outside the United States.
Meta, the Silicon Valley parent company of Facebook, Instagram, and WhatsApp, has been working on a project called Universal Speech Translator to let people communicate with one another regardless of their native language. The project was one of two announced in early February.
Researchers typically train translation AI systems on text, feeding them massive amounts of written material to parse and analyse. However, it is difficult to include in such training the more than 3,000 languages that are primarily spoken and have no widely used writing system.
Hokkien is one such language. It is spoken by roughly 45 million people in Mainland China, Taiwan, Malaysia, Singapore, and the Philippines, but it has no standardised written form.
Because Hokkien speakers who need to write something down typically do so phonetically, written Hokkien varies a great deal from writer to writer. Hokkien-to-English translation data is sparse, and professional human translators are scarce.
In a second, related initiative called No Language Left Behind, Meta says it is developing an advanced AI model that “can learn from languages with fewer examples to train from, and we will utilise it to provide expert-quality translations in hundreds of languages, ranging from Asturian to Luganda to Urdu.”
Meta’s declared long-term goal is to create language tools and machine translation systems that work for “most of the world’s languages.” These two projects are a part of that endeavour.