Powered by what Meta CEO Mark Zuckerberg calls “one of the fastest supercomputers in the world,” the company’s latest artificial intelligence model can now translate between 200 different languages, including many low-resource languages that are not supported by existing translation systems.
The company has named the project “No Language Left Behind” and says the model’s advances will deliver more than 25 billion translations per day across Meta’s apps.
Although there are more than 7,100 known languages in the world today, many of them lack sufficient datasets to train artificial intelligence. These so-called low-resource languages include Egyptian Arabic, Balinese, Sardinian, Nigerian Fula, Bansiran, Mbandu, and many others. They are spoken by large numbers of people but are poorly represented on the Internet.
“The AI modeling technology we use is helping to achieve high-quality translations in these languages spoken by billions of people around the world,” Zuckerberg said in a statement posted on Facebook.
The company said the model can translate 55 African languages “with high quality.”
“To get a sense of how big this project is, this single model covering 200 languages has more than 50 billion parameters, and we trained it on one of the world’s fastest supercomputers, Research SuperCluster (RSC).”
“These advances will allow our apps to perform over 25 billion translations per day.”
“Cross-language communication is one of the superpowers of AI. As we continue to push the envelope in AI research, it also advances the work we do to surface the most interesting content on Facebook and Instagram, recommend the most relevant ads, and keep the service safe for all users.”
“This means that the impact of this technology will reach billions of people around the world, allowing them to communicate in their own language,” Meta AI research scientist Marta R. Costa-jussà noted in a promotional video for the project.
Meta AI user researcher Al Youngblood added, “This is going to change the way people live, the way they do business, and the way they get educated. Everything the ‘No Language Left Behind’ project does is centered on this mission and is truly human-centered.”
To carry out this project, the tech giant first conducted exploratory interviews with native speakers of low-resource languages to understand their translation needs. It then developed a computer model and trained it on data collected with data-mining techniques customized for low-resource languages.
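One common way to mine training data for low-resource languages is to embed monolingual sentences from two languages in a shared vector space and pair up sentences whose embeddings are unusually close. The sketch below illustrates the margin-based scoring idea often used for such bitext mining; the tiny hand-made 3-d vectors stand in for real multilingual sentence embeddings, and the exact pipeline Meta used is not described in this article, so treat this as an illustration rather than the project’s actual method.

```python
import numpy as np

# Toy "sentence embeddings" for two monolingual corpora. In a real
# system these would come from a multilingual sentence encoder; here
# they are hand-made 3-d vectors chosen so that src 0 ~ tgt 0 and
# src 1 ~ tgt 2 are translation pairs, while tgt 1 is unrelated.
src = np.array([[1.0, 0.1, 0.0],
                [0.0, 1.0, 0.1]])
tgt = np.array([[0.9, 0.2, 0.0],
                [0.1, 0.0, 1.0],
                [0.0, 0.9, 0.2]])

def normalize(m):
    return m / np.linalg.norm(m, axis=1, keepdims=True)

# Cosine similarity between every src/tgt sentence pair.
sim = normalize(src) @ normalize(tgt).T

# Margin-based scoring (ratio variant): divide each pair's cosine by
# the average cosine of each side to its k nearest neighbors. This
# penalizes "hub" sentences that are close to everything.
k = 2
fwd = np.sort(sim, axis=1)[:, -k:].mean(axis=1, keepdims=True)
bwd = np.sort(sim, axis=0)[-k:, :].mean(axis=0, keepdims=True)
margin = sim / ((fwd + bwd) / 2)

# Best target candidate for each source sentence.
pairs = margin.argmax(axis=1)
print(pairs.tolist())  # [0, 2]
```

In production-scale mining, the candidate pairs would additionally be thresholded on the margin score, so that source sentences with no true translation in the target corpus are discarded rather than force-matched.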
“Crucially, we evaluated the translation performance of more than 40,000 translation directions using FLORES-200, a human-translated benchmark dataset,” the research team noted in the abstract of the paper describing the AI model.
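The “more than 40,000 translation directions” figure follows from simple pair counting: among n languages there are n × (n − 1) ordered source–target pairs. The concrete language counts below are illustrative, not a claim about the paper’s exact benchmark composition.

```python
# Directed translation pairs among n languages: every ordered
# (source, target) combination with source != target.
def directions(n: int) -> int:
    return n * (n - 1)

print(directions(200))  # 39800
print(directions(202))  # 40602 -- one way to exceed 40,000 directions
```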
The researchers also noted that the project will expand to cover more low-resource languages, thereby reducing digital inequality.
“As the ‘No Language Left Behind’ project aims to reduce the global digital divide, more and more low-resource languages will be included in the scope of the project in the future.”