
Reducing gender bias in Google Translate

Over the course of this year, there’s been an effort across Google to promote fairness and reduce bias in machine learning. Our latest development in this effort addresses gender bias by providing feminine and masculine translations for some gender-neutral words on the Google Translate website.


Google Translate learns from hundreds of millions of already-translated examples from the web. Historically, it has provided only one translation for a query, even if the translation could have either a feminine or masculine form. So when the model produced one translation, it inadvertently replicated gender biases that already existed. For example: it would skew masculine for words like “strong” or “doctor,” and feminine for other words, like “nurse” or “beautiful.”


Now you’ll get both a feminine and masculine translation for a single word—like “surgeon”—when translating from English into French, Italian, Portuguese or Spanish. You’ll also get both translations when translating phrases and sentences from Turkish to English. For example, if you type “o bir doktor” in Turkish, you’ll now get “she is a doctor” and “he is a doctor” as the gender-specific translations.



Gender-specific translations on the Google Translate website.

In the future, we plan to extend gender-specific translations to more languages, launch on other Translate surfaces like our iOS and Android apps, and address gender bias in features like query auto-complete. And we're already thinking about how to address non-binary gender in translations, though it’s not part of this initial launch.


To check out gender-specific translations, visit the Google Translate website, and you can get more information on our Google Translate Help Center page.

Source: Translate


A new look for Google Translate on the web

It’s been twelve years since the launch of Google Translate, and since then Translate has evolved to keep up with the ways people use it. What began as a service translating only between English and Arabic now translates 30 trillion sentences per year across 103 languages.

Google Translate has become an essential tool for communicating across languages, and we recently redesigned the Translate website to make it easier to use. Here’s what you need to know:

  • The site’s new look is now consistent with other Google products, and updated labeling and typography make it easier to navigate. For instance, you’ve always been able to upload documents for translation, but now that feature is easier to find. 
  • Now it’s even more convenient to save and organize important translations you use or look up regularly. We’ve added labels to each saved translation, so if you speak multiple languages, you can sort and group your translations with a single click.
  • We've made the website responsive so it can adjust dynamically for your screen size. So when we launch new features, you get a great web experience across all your devices: mobile, tablet, or desktop. 

The new responsive website adjusts dynamically to your screen size.

To check out the new site, visit translate.google.com.

Source: Translate




Bringing hope to a refugee family, using Google Translate

In 2015, I joined Google to be a part of a company using technology to help others. I’m proud that Google’s commitment to its mission—to organize the world’s information and make it universally accessible and useful—remains strong 20 years in. I knew I wanted to be a part of it all, but had no idea that I would experience the power of our mission firsthand, and that it would help me to forge a friendship when I least expected it.

For the past three years, my wife and I have been working with organizations involved with refugee resettlement efforts. We both have immigrant parents, so we’ve heard stories about resettling in a country to make a better life for your children, but being forced to leave a country is very different. These refugees are often fleeing from life-threatening situations. Aside from dealing with their past trauma and being in an unfamiliar place without a support system, they often can’t speak the local language.

My wife and I learned of a family of four—Nour, Mariam, three-year-old Sanah and six-month-old Yousuf—who settled in Rialto, 45 minutes from where my wife and I grew up in Southern California. Through the assistance of organizations such as Hearts of Mercy and Miry’s List, they settled into an apartment shortly before Mariam gave birth to Yousuf. Still recovering from injuries sustained in Syria, Nour was unable to work, and had to rely on the help of others to get by. Without a car, their options were further limited. Then, in April of this year, they faced their hardest challenge yet: their daughter Sanah was diagnosed with Stage 4 Neuroblastoma.

We wanted to help, but didn’t know where to start—and as new parents ourselves, we could relate on a personal level. We fundraised for the family and collected toys for Yousuf and Sanah in hopes that they could feel supported. Above all, we wanted to help them get through Sanah’s treatments with as little to worry about as possible.

A few weeks after we first heard their story, we went to their home to meet in person. Nour was waiting outside for us, and we quickly realized there was a challenge we had overlooked: the family only spoke Arabic. There I was, face to face with Nour, wanting to hear his story and reassure him that he’s surrounded by a supportive community, but I couldn’t convey those thoughts or give Nour the ability to convey his. The only option I could think of was Google Translate, which I had used on previous international trips and hoped would bridge this gap.

I opened the app to translate a few words, but we couldn’t get far by manually typing sentences. Instead, I tried "conversation" mode, which allows for real-time audio translations and makes the interaction feel more natural. We talked about his family’s story and what they were up against. I learned that back in Syria, Nour was shot twice in the back, and endured the deaths of his brothers. Now, Nour and Mariam are giving up everything to take care of Sanah and spend up to two hours commuting on a bus to and from her hospital treatments. Through all of this, they continue to be optimistic and hopeful, and are grateful for being able to make it to America.


A snapshot of my visit with Nour.

I never imagined that we could sustain a 90-minute conversation in two languages, and that it would bring us closer together, inspiring me in a way I didn’t expect. Without Translate, we would have exchanged a few pleasantries, shared poorly communicated words and parted ways. Instead, we walked away with a bond built on an understanding of one another—we were just two fathers, talking about our fears and hopes for our family’s future. To this day, we stay connected on how the family is doing, and I’m looking forward to keeping this relationship going for a long time.

Refugee families often find themselves in situations that may seem normal to you and me—like at the DMV trying to get a driver’s license—or worse, in a dire situation like a hospital, with no way of communicating. We generally think of technology as an enabler of change, driving efficiency or making the impossible happen. But in this case, technology allowed me to make a life-changing connection, and brought me closer to a family that was very far away from home.

Source: Translate




Offline translations are now a lot better thanks to on-device AI

Just about two years ago we introduced neural machine translation (NMT) to Google Translate, significantly improving the accuracy of our online translations. Today, we’re bringing NMT technology offline—on device. This means the technology runs in the Google Translate apps directly on your Android or iOS device, so you can get high-quality translations even when you don’t have access to an internet connection.

The neural system translates whole sentences at a time, rather than piece by piece. It uses broader context to help determine the most relevant translation, which it then rearranges and adjusts to sound more like a real person speaking with proper grammar. This makes translated paragraphs and articles a lot smoother and easier to read.

Offline translations can be useful when you’re traveling to other countries without a local data plan, don’t have access to the internet, or just don’t want to use cellular data. And since each language set is just 35–45 MB, they won’t take up too much storage space on your phone when you download them.


A comparison between our current phrase-based machine translation (PBMT), new offline neural machine translation (on-device), and online neural machine translation

To try NMT offline translations, go to your Translate app on Android or iOS. If you’ve used offline translations before, you’ll see a banner on your home screen which will take you to the right place to update your offline files. If not, go to your offline translation settings and tap the arrow next to the language name to download the package for that language. Now you’ll be ready to translate text whether you’re online or not. 


We're rolling out this update in 59 languages over the next few days, so get out there and connect to the world around you!

Source: Translate


Tune in for the world’s first Google Translate music tour

Eleven years ago, Google Translate was created to break down language barriers. Since then, it has enabled billions of people and businesses all over the world to talk, connect and understand each other in new ways.

And we’re still re-imagining how it can be used—most recently, with music. Sweden’s music industry is one of the world's most successful exporters of hit music in English, with artists such as Abba, The Cardigans and Avicii originating from the country. But there are still many talented Swedish artists who may never get the recognition or success they deserve outside their small country up in the north.

This sparked an idea: might it be possible to use Google Translate with the sole purpose of breaking a Swedish band internationally?

Today, we’re presenting Translate Tour, in which up-and-coming Swedish indie pop group Vita Bergen will use Google Translate to perform their new single “Tänd Ljusen” in three different languages—English, Spanish and French—on the streets of three different European cities. In just a couple of days, the band will set off to London, Paris and Madrid to sing their locally adapted songs before the public—with the aim of spreading Swedish music culture and inviting people all over the world to tune into the band’s cross-European indie pop music.

Photo credit: Anton Olin.

William Hellström from Vita Bergen will be performing his song in English, Spanish and French.

Last year Google Translate switched from phrase-based translation to Google Neural Machine Translation, which means that the tool now translates whole sentences at a time, rather than just piece by piece. It uses this broader context to figure out the most relevant translation, which it then rearranges and adjusts to be more like a human speaking with proper grammar.

Using this updated version of Google Translate, the English, Spanish and French translations of the song were close to flawless. The translations will also continue to improve as the system learns from the many people using it.

Tune in to Vita Bergen’s release event, live streamed on YouTube today at 5:00 p.m. CEST, or listen to the songs in Swedish (“Tänd Ljusen”), English (“Light the Lights”), Spanish (“Enciende las Luces”) and French (“Allumez les Lumières”).

Source: Translate




Accelerating Deep Learning Research with the Tensor2Tensor Library



Deep Learning (DL) has enabled the rapid advancement of many useful technologies, such as machine translation, speech recognition and object detection. In the research community, one can find code open-sourced by the authors to help in replicating their results and further advancing deep learning. However, most of these DL systems use unique setups that require significant engineering effort and may only work for a specific problem or architecture, making it hard to run new experiments and compare the results.

Today, we are happy to release Tensor2Tensor (T2T), an open-source system for training deep learning models in TensorFlow. T2T facilitates the creation of state-of-the-art models for a wide variety of ML applications, such as translation, parsing, image captioning and more, enabling the exploration of various ideas much faster than previously possible. This release also includes a library of datasets and models, including the best models from a few recent papers (Attention Is All You Need, Depthwise Separable Convolutions for Neural Machine Translation and One Model to Learn Them All) to help kick-start your own DL research.

Translation Model               Training time       BLEU (difference from baseline)
Transformer (T2T)               3 days on 8 GPUs    28.4 (+7.8)
SliceNet (T2T)                  6 days on 32 GPUs   26.1 (+5.5)
GNMT+MoE                        1 day on 64 GPUs    26.0 (+5.4)
ConvS2S                         18 days on 1 GPU    25.1 (+4.5)
GNMT                            1 day on 96 GPUs    24.6 (+4.0)
ByteNet                         8 days on 32 GPUs   23.8 (+3.2)
MOSES (phrase-based baseline)   N/A                 20.6 (+0.0)

BLEU scores (higher is better) on the standard WMT English-German translation task.
As an example of the kind of improvements T2T can offer, we applied the library to machine translation. As you can see in the table above, two different T2T models, SliceNet and Transformer, outperform the previous state-of-the-art, GNMT+MoE. Our best T2T model, Transformer, is 3.8 points better than the standard GNMT model, which itself was 4 points above the baseline phrase-based translation system, MOSES. Notably, with T2T you can approach previous state-of-the-art results with a single GPU in one day: a small Transformer model (not shown above) gets 24.9 BLEU after 1 day of training on a single GPU. Now everyone with a GPU can tinker with great translation models on their own: our GitHub repo has instructions on how to do that.
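To make “higher is better” concrete, here is a minimal, hedged sketch of sentence-level BLEU in plain Python (single reference, add-one smoothing; real WMT evaluations aggregate n-gram counts over a whole corpus and use standardized tooling, so treat this only as an illustration of the metric):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU against a single reference:
    geometric mean of clipped n-gram precisions times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        if total == 0:  # candidate shorter than n tokens
            return 0.0
        overlap = sum(min(count, ref[g]) for g, count in cand.items())
        # add-one smoothing so one sparse n-gram order doesn't zero the score
        precisions.append((overlap + 1) / (total + 1))
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    brevity = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return brevity * geo_mean

ref = "the cat sat on the mat".split()
print(bleu(ref, ref))                           # identical sentences score 1.0
print(bleu("the cat sat on the rug".split(), ref))  # partial match scores lower
```

The clipping (`min(count, ref[g])`) is what stops a candidate from gaming the metric by repeating a common reference word.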

Modular Multi-Task Training
The T2T library is built with familiar TensorFlow tools and defines multiple pieces needed in a deep learning system: data-sets, model architectures, optimizers, learning rate decay schemes, hyperparameters, and so on. Crucially, it enforces a standard interface between all these parts and implements current ML best practices. So you can pick any data-set, model, optimizer and set of hyperparameters, and run the training to check how it performs. We made the architecture modular, so every piece between the input data and the predicted output is a tensor-to-tensor function. If you have a new idea for the model architecture, you don’t need to replace the whole setup. You can keep the embedding part and the loss and everything else, just replace the model body by your own function that takes a tensor as input and returns a tensor.
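The “tensor-to-tensor function” idea is easy to picture with a toy stand-in. The sketch below is purely illustrative (it is not T2T’s actual API; the function names and the tiny random weights are invented for this example): a swappable model body slots between a fixed embedding step and a fixed loss, so trying a new architecture means replacing one function.

```python
import math
import random

# Toy illustration of the tensor-to-tensor design (NOT T2T's real API):
# everything between input and output is a function from one "tensor"
# (here, a list of vectors) to another, so swapping the model body
# leaves the embedding and the loss untouched.

random.seed(0)
VOCAB, DIM = 50, 8
embedding = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(VOCAB)]
proj = [[random.gauss(0, 1) for _ in range(VOCAB)] for _ in range(DIM)]

def embed(token_ids):                  # ids -> list of DIM-vectors
    return [embedding[t] for t in token_ids]

def identity_body(xs):                 # trivial body: tensor in, tensor out
    return xs

def averaging_body(xs):                # a different body with the same contract
    out, acc = [], [0.0] * DIM
    for i, x in enumerate(xs, start=1):
        acc = [a + v for a, v in zip(acc, x)]
        out.append([a / i for a in acc])
    return out

def loss(hidden, target_ids):          # softmax cross-entropy over the vocab
    total = 0.0
    for h, t in zip(hidden, target_ids):
        logits = [sum(hv * pv for hv, pv in zip(h, col)) for col in zip(*proj)]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[t]
    return total / len(target_ids)

def run(body, tokens, targets):        # the full pipeline: embed -> body -> loss
    return loss(body(embed(tokens)), targets)

for body in (identity_body, averaging_body):   # swap the body, keep the rest
    print(body.__name__, round(run(body, [3, 14, 15], [14, 15, 9]), 3))
```

The design choice this models is the standard interface: because both bodies map a list of vectors to a list of vectors, the surrounding training code never changes.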

This means that T2T is flexible, with training no longer pinned to a specific model or dataset. It is so easy that even architectures like the famous LSTM sequence-to-sequence model can be defined in a few dozen lines of code. One can also train a single model on multiple tasks from different domains. Taken to the limit, you can even train a single model on all data-sets concurrently, and we are happy to report that our MultiModel, trained like this and included in T2T, yields good results on many tasks even when training jointly on ImageNet (image classification), MS COCO (image captioning), WSJ (speech recognition), WMT (translation) and the Penn Treebank parsing corpus. It is the first time a single model has been demonstrated to be able to perform all these tasks at once.

Built-in Best Practices
With this initial release, we also provide scripts to generate a number of data-sets widely used in the research community[1], a handful of models[2], a number of hyperparameter configurations, and a well-performing implementation of other important tricks of the trade. While it is hard to list them all, if you decide to run your model with T2T you’ll get for free the correct padding of sequences and the corresponding cross-entropy loss, well-tuned parameters for the Adam optimizer, adaptive batching, synchronous distributed training, well-tuned data augmentation for images, label smoothing, and a number of hyperparameter configurations that worked very well for us, including the ones mentioned above that achieve the state-of-the-art results on translation and may help you get good results too.
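Two of the listed tricks, label smoothing and padding-aware cross-entropy, are simple enough to sketch in plain Python. This is an illustrative rendering of the general techniques, not T2T’s implementation:

```python
import math

def smooth_labels(target_ids, vocab_size, epsilon=0.1):
    """Label smoothing: mix the one-hot target with a uniform distribution,
    so the model is never pushed toward 100% certainty on a single token."""
    rows = []
    for t in target_ids:
        row = [epsilon / vocab_size] * vocab_size
        row[t] += 1.0 - epsilon
        rows.append(row)
    return rows

def padded_cross_entropy(log_probs, target_ids, pad_id=0):
    """Cross-entropy averaged only over non-padding positions --
    the 'correct padding' handling mentioned above."""
    total, count = 0.0, 0
    for lp, t in zip(log_probs, target_ids):
        if t != pad_id:
            total -= lp[t]
            count += 1
    return total / count

smoothed = smooth_labels([2, 0], vocab_size=4)
print(smoothed[0])  # true class keeps ~0.925 of the mass, rest spread uniformly
```

Without the padding mask, short sequences in a batch would have their loss diluted by the zero-padded positions.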

As an example, consider the task of parsing English sentences into their grammatical constituency trees. This problem has been studied for decades and competitive methods were developed with a lot of effort. It can be presented as a sequence-to-sequence problem and be solved with neural networks, but it used to require a lot of tuning. With T2T, it took us only a few days to add the parsing data-set generator and adjust our attention transformer model to train on this problem. To our pleasant surprise, we got very good results in only a week:

Parsing Model             F1 score (higher is better)
Transformer (T2T)         91.3
Dyer et al.               91.7
Zhu et al.                90.4
Socher et al.             90.4
Vinyals & Kaiser et al.   88.3

Parsing F1 scores on the standard test set, section 23 of the WSJ. We only compare here models trained discriminatively on the Penn Treebank WSJ training set; see the paper for more results.
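The sequence-to-sequence framing mentioned above depends on serializing a constituency tree into a flat token sequence, so that parsing becomes “words in, bracketed labels out.” Here is a minimal sketch of one such bracketed linearization (an illustrative scheme, not necessarily the exact encoding used in the paper):

```python
def linearize(tree):
    """Serialize a constituency tree into a flat token sequence.
    A tree is a tuple (label, child, child, ...); leaves are strings."""
    label, *children = tree
    out = ["(" + label]
    for child in children:
        out.extend(linearize(child) if isinstance(child, tuple) else [child])
    out.append(label + ")")
    return out

tree = ("S", ("NP", "John"), ("VP", "sleeps"))
print(" ".join(linearize(tree)))  # (S (NP John NP) (VP sleeps VP) S)
```

Because the linearization is invertible, a sequence model that emits these tokens is effectively emitting a parse tree.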

Contribute to Tensor2Tensor
In addition to exploring existing models and data-sets, you can easily define your own model and add your own data-sets to Tensor2Tensor. We believe the already included models will perform very well for many NLP tasks, so just adding your data-set might lead to interesting results. By making T2T modular, we also make it very easy to contribute your own model and see how it performs on various tasks. In this way the whole community can benefit from a library of baselines and deep learning research can accelerate. So head to our github repository, try the new models, and contribute your own!

Acknowledgements
The release of Tensor2Tensor was only possible thanks to the widespread collaboration of many engineers and researchers. We want to acknowledge here the core team who contributed (in alphabetical order): Samy Bengio, Eugene Brevdo, Francois Chollet, Aidan N. Gomez, Stephan Gouws, Llion Jones, Łukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer, Jakob Uszkoreit, Ashish Vaswani.



[1] We include a number of datasets for image classification (MNIST, CIFAR-10, CIFAR-100, ImageNet), image captioning (MS COCO), translation (WMT with multiple languages including English-German and English-French), language modelling (LM1B), parsing (Penn Treebank), natural language inference (SNLI), speech recognition (TIMIT), algorithmic problems (over a dozen tasks from reversing through addition and multiplication to algebra) and we will be adding more and welcome your data-sets too.

[2] Including LSTM sequence-to-sequence RNNs, convolutional networks also with separable convolutions (e.g., Xception), recently researched models like ByteNet or the Neural GPU, and our new state-of-the-art models mentioned in this post that we will be actively updating in the repository.