Applying Differential Privacy to Large Scale Image Classification

Machine learning (ML) models are becoming increasingly valuable for improved performance across a variety of consumer products, from recommendations to automatic image classification. However, despite aggregating large amounts of data, models can in theory encode characteristics of individual entries from the training set. For example, experiments in controlled settings have shown that language models trained on email datasets may encode sensitive information from the training data and can reveal whether a particular user’s data was present in the training set. As such, it is important to prevent models from encoding such characteristics of individual training entries. To this end, researchers are increasingly employing federated learning approaches.

Differential privacy (DP) provides a rigorous mathematical framework that allows researchers to quantify and understand the privacy guarantees of a system or an algorithm. Within the DP framework, privacy guarantees of a system are usually characterized by a positive parameter ε, called the privacy loss bound, with smaller ε corresponding to better privacy. One usually trains a model with DP guarantees using DP-SGD (differentially private stochastic gradient descent), a specialized training algorithm that provides DP guarantees for the trained model.
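
For reference, DP-SGD targets the standard (ε, δ) form of differential privacy. In the definition below (a textbook statement, not specific to this work), M is a randomized training algorithm, D and D′ are datasets differing in a single entry, and S ranges over sets of possible outputs:

    \Pr[M(D) \in S] \le e^{\varepsilon} \cdot \Pr[M(D') \in S] + \delta

The privacy loss bound ε measures how distinguishable the two output distributions can be, and δ is a small allowed failure probability; smaller values of both mean stronger privacy.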

However, training with DP-SGD typically has two major drawbacks. First, most existing implementations of DP-SGD are inefficient and slow, which makes them hard to use on large datasets. Second, DP-SGD training often significantly impacts utility (such as model accuracy), to the point that models trained with DP-SGD may become unusable in practice. As a result, most DP research papers evaluate DP algorithms on very small datasets (MNIST, CIFAR-10, or UCI) and do not attempt evaluation on larger datasets such as ImageNet.

In “Toward Training at ImageNet Scale with Differential Privacy”, we share initial results from our ongoing effort to train a large image classification model on ImageNet using DP while maintaining high accuracy and minimizing computational cost. We show that the combination of various training techniques, such as careful choice of the model and hyperparameters, large-batch training, and transfer learning from other datasets, can significantly boost the accuracy of an ImageNet model trained with DP. To substantiate these discoveries and encourage follow-up research, we are also releasing the associated source code.

Testing Differential Privacy on ImageNet
We chose ImageNet classification as a demonstration of the practicality and efficacy of DP because: (1) it is an ambitious task for DP, for which no prior work shows sufficient progress; and (2) it is a public dataset on which other researchers can operate, so it represents an opportunity to collectively improve the utility of real-life DP training. Classification on ImageNet is challenging for DP because it requires large networks with many parameters. This translates into a significant amount of noise added into the computation, because the magnitude of the added noise scales with the size of the model.

Scaling Differential Privacy with JAX
Exploring multiple architectures and training configurations to determine what works for DP can be debilitatingly slow. To streamline our efforts, we used JAX, a high-performance computational library based on XLA that performs efficient auto-vectorization and just-in-time compilation of mathematical computations. Using these JAX features was previously recommended as a good way to speed up DP-SGD in the context of smaller datasets such as CIFAR-10.

We created our own implementation of DP-SGD on JAX and benchmarked it on the large ImageNet dataset (the code is included in our release). The JAX implementation was relatively simple and yielded noticeable performance gains simply from using the XLA compiler. Compared to other implementations of DP-SGD, such as the one in TensorFlow Privacy, the JAX implementation is consistently several times faster. It is typically even faster than the custom-built and optimized PyTorch Opacus.

Each step of our DP-SGD implementation takes approximately two forward-backward passes through the network. While this is slower than non-private training, which requires only a single forward-backward pass, it is still the most efficient known approach to train with the per-example gradients necessary for DP-SGD. The graph below shows training runtimes for two models on ImageNet with DP-SGD vs. non-private SGD, both implemented in JAX. Overall, we find DP-SGD on JAX fast enough to run large experiments, requiring only a slight reduction in the number of training runs used to find optimal hyperparameters compared to non-private training. This is significantly better than alternatives, such as TensorFlow Privacy, which we found to be ~5x–10x slower on our CIFAR-10 and MNIST benchmarks.
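
To make this concrete, here is a minimal sketch of a single DP-SGD step in JAX: per-example gradients via jax.vmap over jax.grad, per-example L2 clipping, and Gaussian noise added to the summed gradient before a plain SGD update. This is an illustrative sketch, not the released implementation; names such as loss_fn, l2_clip, and noise_multiplier are assumptions for the example.

    import jax
    import jax.numpy as jnp

    def dp_sgd_step(params, x, y, key, loss_fn,
                    l2_clip=1.0, noise_multiplier=1.0, lr=0.1):
        # Per-example gradients: vectorize grad over the batch dimension.
        grads = jax.vmap(jax.grad(loss_fn), in_axes=(None, 0, 0))(params, x, y)

        # Per-example global L2 norm across all parameter leaves.
        sq_norms = sum(jnp.sum(g.reshape(g.shape[0], -1) ** 2, axis=1)
                       for g in jax.tree_util.tree_leaves(grads))
        scale = jnp.minimum(1.0, l2_clip / (jnp.sqrt(sq_norms) + 1e-12))

        # Clip each example's gradient, then sum over the batch.
        summed = jax.tree_util.tree_map(
            lambda g: jnp.sum(g * scale.reshape((-1,) + (1,) * (g.ndim - 1)),
                              axis=0),
            grads)

        # Add Gaussian noise calibrated to the clipping norm, then average.
        leaves, treedef = jax.tree_util.tree_flatten(summed)
        keys = jax.random.split(key, len(leaves))
        noisy = [g + noise_multiplier * l2_clip * jax.random.normal(k, g.shape)
                 for g, k in zip(leaves, keys)]
        mean_grad = jax.tree_util.tree_unflatten(
            treedef, [g / x.shape[0] for g in noisy])

        # Gradient descent update on the noised average gradient.
        return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, mean_grad)

In practice such a step would be wrapped in jax.jit so that XLA can compile and fuse the vectorized per-example computation, which is where the speedups described above come from.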

Time in seconds per training epoch on ImageNet using a ResNet-18 or ResNet-50 architecture with 8 V100 GPUs.

Combining Techniques for Improved Accuracy
It is possible that future training algorithms may improve DP’s privacy-utility tradeoff. However, with current algorithms, such as DP-SGD, our experience points to an engineering “bag-of-tricks” approach to make DP more practical on challenging tasks like ImageNet.

Because we can train models faster with JAX, we can iterate quickly and explore multiple configurations to find what works well for DP. We report the following combination of techniques as useful to achieve non-trivial accuracy and privacy on ImageNet:

  • Full-batch training

    Theoretically, it is known that larger minibatch sizes improve the utility of DP-SGD, with full-batch training (i.e., where the full dataset is one batch) giving the best utility [1, 2], and empirical results are emerging to support this theory. Indeed, our experiments demonstrate that increasing the batch size along with the number of training epochs leads to a decrease in ε while still maintaining accuracy. However, training with extremely large batches is non-trivial, as such batches cannot fit into GPU/TPU memory. So, we employed virtual large-batch training: instead of applying a gradient update at each training step, we accumulated gradients over multiple steps before updating the weights (see the sketch after this list).

    Batch size             1024        4 × 1024    16 × 1024   64 × 1024
    Number of epochs       10          40          160         640
    Accuracy               56%         57.5%       57.9%       57.2%
    Privacy loss bound ε   9.8 × 10⁸   6.1 × 10⁷   3.5 × 10⁶   6.7 × 10⁴

  • Transfer learning from public data

    Pre-training on public data followed by DP fine-tuning on private data has previously been shown to improve accuracy on other benchmarks [3, 4]. A question that remains is what public data to use for a given task to optimize transfer learning. In this work we simulate a private/public data split by using ImageNet as “private” data and Places365, another image classification dataset, as a proxy for “public” data. We pre-trained our models on Places365 before fine-tuning them with DP-SGD on ImageNet. Places365 only has images of landscapes and buildings, not of animals as in ImageNet, so it is quite different, making it a good candidate to demonstrate the ability of the model to transfer to a different but related domain.

    We found that transfer learning from Places365 gave us 47.5% accuracy on ImageNet with a reasonable level of privacy (ε = 10). This is low compared to the 70% accuracy of a similar non-private model, but quite good compared to naïve DP training on ImageNet, which yields either very low accuracy (2–5%) or no meaningful privacy (ε = 10⁹).
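
The sketch below illustrates the virtual large-batch trick referenced in the full-batch training bullet above: clipped per-example gradient sums are accumulated across micro-batches that fit in accelerator memory, and noise is added once per virtual batch before a single weight update. As with the earlier sketch, the helper and parameter names are illustrative assumptions, not the released code.

    import jax
    import jax.numpy as jnp

    def clipped_grad_sum(params, x, y, loss_fn, l2_clip):
        # Sum of per-example gradients, each clipped to L2 norm <= l2_clip
        # (the same clipping used in the DP-SGD sketch above).
        grads = jax.vmap(jax.grad(loss_fn), in_axes=(None, 0, 0))(params, x, y)
        sq_norms = sum(jnp.sum(g.reshape(g.shape[0], -1) ** 2, axis=1)
                       for g in jax.tree_util.tree_leaves(grads))
        scale = jnp.minimum(1.0, l2_clip / (jnp.sqrt(sq_norms) + 1e-12))
        return jax.tree_util.tree_map(
            lambda g: jnp.sum(g * scale.reshape((-1,) + (1,) * (g.ndim - 1)),
                              axis=0),
            grads)

    def virtual_batch_update(params, micro_batches, key, loss_fn,
                             l2_clip=1.0, noise_multiplier=1.0, lr=0.1):
        # e.g. 64 micro-batches of 1024 examples form a virtual batch of
        # 64 x 1024 without ever materializing it in accelerator memory.
        acc, count = None, 0
        for x, y in micro_batches:
            g = clipped_grad_sum(params, x, y, loss_fn, l2_clip)
            acc = g if acc is None else jax.tree_util.tree_map(jnp.add, acc, g)
            count += x.shape[0]

        # Noise is drawn once per virtual batch, calibrated to the clip norm.
        leaves, treedef = jax.tree_util.tree_flatten(acc)
        keys = jax.random.split(key, len(leaves))
        noisy = [g + noise_multiplier * l2_clip * jax.random.normal(k, g.shape)
                 for g, k in zip(leaves, keys)]
        mean_grad = jax.tree_util.tree_unflatten(
            treedef, [g / count for g in noisy])
        return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, mean_grad)

Because the noise is added once per virtual batch rather than once per micro-batch, the update (and its privacy accounting) matches that of a true large batch.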

Privacy-accuracy tradeoff for ResNet-18 on ImageNet using large-batch training with transfer learning from Places365.

Next Steps
We hope these early results and source code provide an impetus for other researchers to work on improving DP for ambitious tasks, with ImageNet serving as a proxy for challenging production-scale tasks. With the much faster DP-SGD on JAX, we urge DP and ML researchers to explore diverse training regimes, model architectures, and algorithms to make DP more practical. To continue advancing the state of the field, we recommend researchers start with a baseline that incorporates full-batch training plus transfer learning.

Acknowledgments
This work was carried out with the support of the Google Visiting Researcher Program while Prof. Geambasu, an Associate Professor with Columbia University, was on sabbatical with Google Research. This work received substantial contributions from Steve Chien, Shuang Song, Andreas Terzis and Abhradeep Guha Thakurta.

Source: Google AI Blog


Your hybrid meetings could be better — here’s how

As we explored in a recent global survey we commissioned from Economist Impact, employees around the world are looking for new ways of working and connecting with each other and their organizations as remote and hybrid work models continue to evolve. Creating a blueprint for more inclusive and collaborative meetings can help teams feel more connected—wherever and however they work together.

Scheduling meetings

If you work with people in other time zones, then you know scheduling can be a logistical headache. At Google, we follow a few guidelines to optimize participation.

Share information for better scheduling: Encourage your team to add their working hours, location and focus time to their calendars.

Only invite those who can contribute: If you aren’t sure, add teammates who are less essential to the meeting as “optional.”

Choose dates and times that work for more people: For teams in distant time zones, add secondary time zones to your calendar, schedule global meetings well in advance, and discuss alternating host time zones for regular calls.

Add an agenda to the Calendar invite: Let people know at least 24 hours in advance what a meeting will be about — for example: “This meeting will be successful if we leave with four great ideas from the brainstorming session” — so participants can prepare. And don’t forget you can schedule-send the agenda so it arrives right before the meeting, at the right time for attendees in other time zones.

Encourage RSVPing with location: Have attendees share whether they’ll be “in a meeting room” or “joining virtually” so everyone, including the organizer, knows what to expect.

Rotate facilitator and note-taker roles: Having team members alternate roles lessens the burden on one person and gives everyone a chance to participate more fully.

Prep with Spaces

Spaces is Google Workspace’s central place for team collaboration. It works closely with tools like Gmail, Calendar, Chat, Drive and Meet so coworkers can work on projects digitally, share ideas and even connect on a personal level. Participants can prepare for meetings by reviewing documents and presentations side by side and collaborating with questions and suggestions, with everything saved in Spaces for future reference.

Accessing content directly from Spaces can help meeting attendees stay up to date.

During the meeting

Good hybrid meetings shouldn’t feel like two different conversations that happen in the room and remotely. To keep them feeling like a single inclusive experience, try the following:

Help virtual team members connect: Acknowledge when remote teammates join and use the first five minutes to connect. Some Google teams start by asking questions like “what was the best thing you ate this weekend?” or playing interesting YouTube videos.

Keep and share meeting notes: Note-takers can use a pre-populated notes Doc in the Calendar invite, or even meeting recordings, to share what happened with attendees and anyone who couldn’t make it.

Collaborate with Companion mode: Google Meet’s Companion mode can help everyone participate, no matter where they are. For people in the conference room, Companion mode turns off the video and audio on laptops so participants can use functions like chat, screen sharing, hand-raising, polls, host controls and more, while avoiding feedback with the conference room hardware. Team members can also enable captions and translations in their preferred language and view presentations up close on their own device.

Foster inclusivity: Facilitators can make sure everyone feels heard by encouraging remote contributions, avoiding “in the room” side conversations and reminding mixed language teams to use translated captions.

Provide multiple ways to give feedback: Not everyone is comfortable speaking in a meeting, so make sure people know they can use the chat option, or try using the poll feature to engage everyone in offering input.

Use virtual rather than physical whiteboards: With Jamboard or the Jamboard app, remote attendees can also view and contribute.

Join Companion mode by selecting “Use Companion mode” under Other joining options.

After the meeting

Many of us have experienced meeting fatigue as our teams became more distributed over the last two years. It’s crucial to make sure attendees feel their time is well spent, and there are a few ways you can do that. For starters, try sending a follow-up note thanking attendees for coming, asking for feedback and sharing any notes, recordings, action items and decisions. You can also post meeting assets to the relevant Spaces so absent team members can contribute. It’s also a good idea to gather general feedback for recurring meetings — try polling people once a quarter using Google Forms, possibly anonymously, about how the meeting could be made more productive and inclusive.

Discover more tips and best practices

As hybrid meetings become the norm for millions of people, discovering and encouraging best practices that make meetings more inclusive is an essential part of the evolving future of work.

Discover more hybrid work tips and best practices on our future of work site.

Grow your game’s revenue with Google Play Console’s new strategic guidance

Posted by Phalene Gowling, Product Manager, Google Play

Last year, mobile game consumer spending grew 7.3% to $93.2 billion with no signs of slowing down. In this competitive, growing market, effectively monetizing your audience has never been more important. But without access to a strategy consultant, how can you know if your monetization strategy is as strong as it can be?

That’s why we’re expanding the suite of tools available in Play Console to help your monetization strategy be exactly that. Last year, we released new engagement and monetization metrics on the Statistics page to help you grow your business, and now we’re pleased to announce new strategic guidance tools to help you drive successful monetization.

In this new section, you’ll see our metric-driven guidance to help you better monetize your game by:

  1. Contextualizing your topline revenue: Understand how your game’s revenue metrics contribute to your overall business goals, and learn when to prioritize optimizing for one metric over another.
  2. Identifying opportunities: Find out where there is an opportunity to improve a metric by benchmarking against peer groups, and explore insights by country.
  3. Recommending next steps: Learn how to take advantage of monetization opportunities with specific actions you can take right away.

The strategic guidance metric hierarchy. (Learn more or visit our Play Academy for specific courses like monitoring KPIs.)

We’ve spent the last couple of years perfecting our guidance, and testing the dashboard with selected partners. Feedback on our strategic guidance has been positive — and we hope you’ll find it useful, too.

“This is extremely useful! These type of insights are actually what we expect from Google, because this is something that really can help us to scale our business.”

- Product Manager at Gameloft


Understand key monetization drivers and their relationships with the metric hierarchy

Strategic guidance can be found in Financial reports within Play Console. In partnership with experts in mobile games growth, we’ve included primary monetization metrics (including new metrics) and their relationships to help you easily assess your performance and measure against your peers. You can see all the metrics in this Help Center article.

The metric hierarchy is a tool to help you understand how you and your teams can directly influence the lower-level metrics of your game’s performance, like buyer conversions, which contribute to your overall top-line business performance. Using peer-set comparisons and per-country breakdowns, you can quickly identify your biggest growth opportunities: which markets are underperforming and where you are a market leader.


Explore metric analysis to turn insights into action

Select a metric and explore it in detail to track your performance over time. Strategic guidance shows you a breakdown of your chosen metric by location to help you spot opportunities to expand your game globally. The detailed metric analysis also helps you identify where a small investment has an outsized return.

Strategic guidance metric recommendation example for returning daily buyer ratio.

Whether you’ve created a casual game or an RPG, the metric-specific recommendations are designed to be insightful and relevant to a variety of game developers. They can be used to help you diversify your promotional content, refine your game mechanics, or test new price points that enable purchasing power parity.


Get IAP monetization guidance today, with more insights to come

With an increasing number of developers shifting focus from an ads-only monetization model to include in-app purchases (IAP), we’ve developed strategic guidance to be most relevant for developers that include IAP monetization as part of their overall strategy. With this launch, we’re excited to bring growth-consulting opportunities to these game developers at scale. Stay tuned for more launches this year to help you successfully drive your revenue growth.

Dev Channel Update for Desktop

The Dev channel has been updated to 99.0.4844.16 for Windows and Linux, and 99.0.4844.15 for Mac.

A partial list of changes is available in the log. Interested in switching release channels? Find out how. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.

Prudhvikumar Bommana

Google Chrome

Happy Lunar New Year! Get ready for the Year of the Tiger

Starting February 1, about a quarter of the world, including much of East Asia, will be celebrating Lunar New Year, and the beginning of the Year of the Tiger. As a tiger year, 2022 will be associated with the animal’s attributes of bravery, confidence and strong will. Previous tiger years include 2010, 1998, 1986, 1974, 1962, 1950 and 1938.

We’re celebrating Lunar New Year with a few fun features like an animated Doodle on Search and a special, firework-filled surprise when you look up “Lunar New Year.” On Android, you can choose from more than 200 new Year of the Tiger-inspired Emoji Kitchen combinations in Gboard. And don’t forget to wish Google Assistant a happy Lunar New Year and hear the response.

During the upcoming days and weeks of festivities, Lunar New Year will be welcomed with feasting and the honoring of ancestors and deities, and those who celebrate will focus on fortune, happiness and prosperity. We wanted to hear how Googlers are honoring these traditions, what new ones they’ve created and what they’re most excited about for this new year. With help from the Asian Google Network, we learned how our very own colleagues celebrate this holiday.

What is your favorite memory of celebrating a Lunar New Year with your family?

What current traditions do you and your family incorporate in your Lunar New Year celebrations?

The Year of the Tiger will be about making big changes, risk-taking and adventure. How are you planning to be courageous in 2022?

Happy Lunar New Year! We hope the Year of the Tiger is full of adventure and hope for everyone.

Connecting people with domestic violence support

Editor’s note: The following content contains information related to intimate partner violence, which may be triggering to survivors.

Intimate partner violence (IPV), also known as domestic violence or relationship abuse, is a pattern of behavior used to gain or maintain power and control over a partner, and it affects more than 12 million people in the United States every year. One in four women and one in seven men aged 18 and older in the U.S. will be the victim of severe physical violence from an intimate partner in their lifetime. Relationship abuse comes in many forms — apparent or invisible; emotional, financial, digital or physical; in families, in couples and in relationships of all kinds — and the risk has only grown during the COVID-19 pandemic.

The National Domestic Violence Hotline (The Hotline) is a vital service. Our mission is to answer the call to support and shift power back to those affected by relationship abuse — 24 hours a day, 7 days a week, 365 days a year. Now, The Hotline is working with Google to help provide quicker access to domestic violence information and 24/7 support to those who need it.

Starting today, when people in the U.S. search for information related to domestic violence on Google, they will see a box at the top of the search results displaying the contact information for The Hotline – with direct access to our phone and chat services. This will help survivors, especially those in crisis, get the information and connection to the 24/7 support they need quickly and with less scrolling. Finding the right information quickly is essential for survivors, as their window to safely reach out for support may be limited.

An image of the Google search results page for the query “domestic violence help,” showing a new box at the top of the results with resources for the National Domestic Violence Hotline.

The Hotline is the only national 24-hour domestic violence hotline providing compassionate support, life-saving resources and personalized safety planning via phone, online chat and text. Services are provided in English and Spanish through bilingual advocates and in more than 200 other languages through interpretation.

As always, the safety of survivors is our primary concern. Our advocates are available 24/7 to answer questions and give advice about internet safety for survivors. They can suggest things like:

  • Managing your search and browsing history
  • Using computers found at your local library, Internet cafe, shelter, workplace or computer technology center
  • Setting up an alternate email account that a partner doesn’t know about

Swiftly connecting survivors with support is critical right now, because the COVID-19 pandemic has only exacerbated risk factors for domestic violence. More people are experiencing isolation and are limited in their ability to go to work, go to school or see friends and family. The pandemic has also accelerated an increase in economic and housing instability for many experiencing abuse: Since 2016, housing instability as a concern has grown by an average of 20% annually among those contacting The Hotline.

All these factors heighten the risk of relationship abuse, and limit the ability of survivors to reach out for support. COVID-19 has left many survivors feeling trapped in their homes with an abusive partner – with limited time to get resources or information. To learn more about how the pandemic has affected survivors, read our Year One report.

Through this partnership with Google, we want to ensure the first resource people see when searching for domestic violence is reliable, helpful information that empowers them to get support as quickly as possible.

VPN by Google One comes to iOS

When we launched the VPN by Google One for Android in the U.S., we wanted to give you an extra layer of online protection for your phone and the peace of mind that your connection is safer. Today, we’re sharing the latest updates to the VPN, which help bring protection to even more people and make it even easier to use.

Bringing peace of mind to iOS users and more countries

Today, we’re beginning to roll out the VPN to iOS devices. As on Android, the VPN will be available to Google One members on Premium plans (2 TB and higher) through the Google One app on iOS. Plus, members can share their plan and the VPN with up to five family members at no extra cost, so they can all use the VPN, whether they’re on an Android or iOS phone.

A smartphone displays an animation of the VPN by Google One app.

We also recently expanded the VPN to 10 more countries: Austria, Belgium, Denmark, Finland, Iceland, Ireland, the Netherlands, Norway, Sweden and Switzerland. And we’ll expand to more countries over time.

Making the VPN by Google One even easier to use

You may have seen that we also added features to the VPN for Android that make it even easier to use:

  • Safe Disconnect: Only use the internet when the VPN is activated.
  • App Bypass: Allow specific apps to use a standard connection instead of the VPN.
  • Snooze: Temporarily turn off the VPN.

Privacy and security are always core to everything we make. Our systems have advanced security built in to help ensure no one can use the VPN to tie your online activity to your identity. Our client libraries are open source, and our end-to-end systems have been independently audited. Our VPN is fully certified by the Internet of Secure Things Alliance (ioXt), passing all eight of ioXt’s security principles.

We'll keep adding more security features and benefits to our Premium plans, so you know your data is safe. If you’re not already a member, you can sign up for a 2 TB Google One plan.

A graphic showing and comparing the various Google One plans: Basic, Standard and Premium.

Google Store rewards give you Store credit on hardware purchases from the Google Store, and are only available in the U.S., Canada, Australia, Germany and the United Kingdom. Pro Sessions have limited appointments, available on a first-come, first-served basis. Google Photos editing features are only available on devices with at least 3 GB of RAM running Android 8.0 or iOS 14.0 and above.

You can learn more about Google One plans and benefits at one.google.com/about.