Google Translate – 10 More Languages with Your Help

Cross-posted on the Inside Search Blog

Whether you're teaching yourself a new language or trying to make a new friend, Google Translate can be a powerful tool for crossing language barriers. Today, we're adding 10 languages to Translate, bringing our total number of supported languages to 90. These 10 new languages will allow more than 200 million additional people to translate text to and from their native languages. These languages are available now on translate.google.com and will roll out soon to our mobile apps and to the built-in translation functionality in Chrome.  

Without the active participation of the Translate Community, we wouldn't be able to launch some of these languages today. While our translation system learns from translated data found on the web, sometimes we need support from humans to improve our algorithms. We're very grateful for all the support we're getting, and we hope that together with our community, we can continue improving translation quality for the languages we support today and add even more languages in the future.


Spotlight on our new languages

Africa gets more language coverage with Chichewa, Malagasy, and Sesotho:

  • Chichewa (Chinyanja) is spoken by 12 million people in Malawi and surrounding countries. It is one of 55 languages used in the greetings that now travel the galaxy on the Voyager interstellar probes.
  • Malagasy is spoken by 18 million people in Madagascar, where it is the national language. It is one of only a few languages that put the verb first in sentences, followed by the object and then the subject.
  • Sesotho has 6 million native speakers. It is the national language of Lesotho and one of 11 official languages in South Africa.

In India and Southeast Asia, we are adding Malayalam, Myanmar, Sinhala, and Sundanese:

  • Malayalam (മലയാളം), with 38 million native speakers, is a major language in India and one of that country’s 6 classical languages. It’s been one of the most-requested languages, so we are especially excited to add Malayalam support!
  • Myanmar (Burmese, မြန်မာစာ) is the official language of Myanmar, with 33 million native speakers. The Myanmar language has been in the works for a long time because it is a challenging language for automatic translation, in terms of both language structure and font encoding. While our system understands different Myanmar inputs, we encourage the use of open standards and therefore only output Myanmar translations in Unicode.
  • Sinhala (සිංහල) is one of the official languages of Sri Lanka and is natively spoken by 16 million people. In September, the local community in Sri Lanka organized Sinhala Translate Week, and since then, participants have contributed tens of thousands of translations to our system. We're happy to be able to release Sinhala as one of the new languages today!
  • Sundanese (Basa Sunda) is spoken on the island of Java in Indonesia by 39 million people. While Sundanese does have its own script, it is today commonly written using the Latin alphabet, which is what our system uses.

In Central Asia, we are adding Kazakh, Tajik, and Uzbek:

  • Kazakh (Қазақ тілі) has 11 million native speakers in Kazakhstan. We've received strong support from Kazakh language enthusiasts, and we hope to continue collaborating with the local communities in the region to add even more languages in the future, including Kyrgyz.
  • Tajik (Тоҷикӣ), a close relative of modern Persian, is spoken by more than 4 million people in Tajikistan and beyond.
  • Uzbek (Oʻzbek tili) is spoken by 25 million people in Uzbekistan. In addition to receiving Uzbek community support, we've incorporated the Uzbek dictionary by Shavkat Butaev into our system.

We’re just getting started with these new languages and have a long way to go. You can help us by suggesting corrections using the "Improve this translation" functionality on Translate and by contributing to Translate Community.

Posted by the Google Translate engineering team