Monthly Archives: February 2019

Fighting disinformation across our products

Providing useful and trusted information at the scale that the Internet has reached is enormously complex and an important responsibility. Adding to that complexity, over the last several years we’ve seen organized campaigns use online platforms to deliberately spread false or misleading information.

We have twenty years of experience with these information challenges, and providing trustworthy information is something we strive to do better than anyone else. While there is more work to do, we've been working hard to combat this problem for many years.

Today at the Munich Security Conference, we presented a white paper that gives more detail about our work to tackle the intentional spread of misinformation—across Google Search, Google News, YouTube and our advertising systems. We have a significant effort dedicated to this work throughout the company, based on three foundational pillars:

  • Improve our products so they continue to make quality count;
  • Counteract malicious actors seeking to spread disinformation;
  • Give people context about the information they see.

The white paper also explains how we work beyond our products to support a healthy journalistic ecosystem, partner with civil society and researchers, and stay one step ahead of future risks.

We hope this paper and increased transparency can lead to more dialogue about what we and others can do better on these issues. We're committed to acting responsibly and thoroughly as we tackle this important challenge.

New UI tools and a richer creative canvas come to ARCore

Posted by Evan Hardesty Parker, Software Engineer

ARCore and Sceneform give developers simple yet powerful tools for creating augmented reality (AR) experiences. In our last update (version 1.6) we focused on making virtual objects appear more realistic within a scene. In version 1.7, we're focusing on creative elements like AR selfies and animation as well as helping you improve the core user experience in your apps.

Creating AR Selfies

Example of 3D face mesh application

ARCore's new Augmented Faces API (available on the front-facing camera) offers a high-quality, 468-point 3D mesh that lets users attach fun effects to their faces. From animated masks, glasses, and virtual hats to skin retouching, the mesh provides coordinates and region-specific anchors that make it possible to add these delightful effects.

You can get started in Unity or Sceneform by creating an ARCore session with the "front-facing camera" and Augmented Faces "mesh" mode enabled. Note that other AR features such as plane detection aren't currently available when using the front-facing camera. AugmentedFace extends Trackable, so faces are detected and updated just like planes, Augmented Images, and other trackables.

// Create an ARCore session that supports Augmented Faces, for use in Sceneform.
public Session createAugmentedFacesSession(Activity activity) throws UnavailableException {
  // Use the front-facing (selfie) camera.
  Session session = new Session(activity, EnumSet.of(Session.Feature.FRONT_CAMERA));
  // Enable Augmented Faces.
  Config config = session.getConfig();
  config.setAugmentedFaceMode(Config.AugmentedFaceMode.MESH3D);
  session.configure(config);
  return session;
}
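
Once the session is running, detected faces arrive as AugmentedFace trackables that you can decorate each frame. The snippet below follows the pattern of the Sceneform face sample and is only a sketch: faceRegionsRenderable and faceNodeMap are placeholder names for a renderable and a map that your app would manage.

// Each frame, attach an AugmentedFaceNode to any newly detected face (sketch only).
scene.addOnUpdateListener(frameTime -> {
  Collection<AugmentedFace> faces = arSceneView.getSession().getAllTrackables(AugmentedFace.class);
  for (AugmentedFace face : faces) {
    if (!faceNodeMap.containsKey(face)) {
      AugmentedFaceNode faceNode = new AugmentedFaceNode(face);
      faceNode.setParent(scene);
      // For example, a virtual hat or mask model loaded elsewhere in the app.
      faceNode.setFaceRegionsRenderable(faceRegionsRenderable);
      faceNodeMap.put(face, faceNode);
    }
  }
});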

Animating characters in your Sceneform AR apps

Another way version 1.7 expands the AR creative canvas is by letting your objects dance, jump, spin and move around with support for animations in Sceneform. To start an animation, initialize a ModelAnimator (an extension of the existing Android animation support) with animation data from your ModelRenderable.

private ModelAnimator animator;

void startDancing(ModelRenderable andyRenderable) {
  // Look up the animation data exported with the model.
  AnimationData data = andyRenderable.getAnimationData("andy_dancing");
  // ModelAnimator extends the standard Android Animator, so it can be started,
  // paused, and listened to like any other animation.
  animator = new ModelAnimator(data, andyRenderable);
  animator.start();
}

Solving common AR UX challenges in Unity with new UI components

In ARCore version 1.7 we also focused on helping you improve your user experience with a simplified workflow. We've integrated "ARCore Elements" -- a set of common AR UI components that have been validated with user testing -- into the ARCore SDK for Unity. You can use ARCore Elements to insert common AR interaction patterns in your apps without having to reinvent the wheel. ARCore Elements also makes it easier to follow Google's recommended AR UX guidelines.

ARCore Elements includes two AR UI components that are especially useful:

  • Plane Finding - streamlining the key steps involved in detecting a surface
  • Object Manipulation - using intuitive gestures to rotate, elevate, move, and resize virtual objects

We plan to add more to ARCore Elements over time. You can download the ARCore Elements app from the Google Play Store to learn more.

Improving the User Experience with Shared Camera Access

ARCore version 1.7 also includes UX enhancements for the smartphone camera -- specifically, the experience of switching in and out of AR mode. Shared Camera access in the ARCore SDK for Java lets users pause an AR experience, access the camera, and jump back in. This can be particularly helpful if users want to take a picture of the action in your app.
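
As a rough sketch of how an app opts in, the ARCore SDK for Java exposes a SHARED_CAMERA session feature together with a SharedCamera helper. The method below mirrors the style of the snippets earlier in this post and leaves out the Camera2 plumbing that the Java sample walks through:

// Create an ARCore session that shares the device camera with the app (sketch only).
public Session createSharedCameraSession(Activity activity) throws UnavailableException {
  // Opt in to shared camera access.
  Session session = new Session(activity, EnumSet.of(Session.Feature.SHARED_CAMERA));
  // The SharedCamera instance and camera ID are then used with the Android Camera2 API
  // to keep showing the camera feed while the AR session is paused.
  SharedCamera sharedCamera = session.getSharedCamera();
  String cameraId = session.getCameraConfig().getCameraId();
  return session;
}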

More details are available in the Shared Camera developer documentation and Java sample.

Learn more and get started

For AR experiences to capture users' imaginations they need to be both immersive and easily accessible. With tools for adding AR selfies, animation, and UI enhancements, ARCore version 1.7 can help with both these objectives.

You can learn more about these new updates on our ARCore developer website.

Introducing PlaNet: A Deep Planning Network for Reinforcement Learning



Research into how artificial agents can improve their decisions over time is progressing rapidly via reinforcement learning (RL). For this technique, an agent observes a stream of sensory inputs (e.g. camera images) while choosing actions (e.g. motor commands), and sometimes receives a reward for achieving a specified goal. Model-free approaches to RL aim to directly predict good actions from the sensory observations, enabling DeepMind's DQN to play Atari and other agents to control robots. However, this black-box approach often requires several weeks of simulated interaction to learn through trial and error, limiting its usefulness in practice.

Model-based RL, in contrast, attempts to have agents learn how the world behaves in general. Instead of directly mapping observations to actions, a world model allows an agent to explicitly plan ahead and select actions more carefully by "imagining" their long-term outcomes. Model-based approaches have achieved substantial successes, including AlphaGo, which imagines taking sequences of moves on a fictitious board with the known rules of the game. However, to leverage planning in unknown environments (such as controlling a robot given only pixels as input), the agent must learn the rules or dynamics from experience. Because such dynamics models in principle allow for higher efficiency and natural multi-task learning, creating models that are accurate enough for successful planning is a long-standing goal of RL.

To spur progress on this research challenge and in collaboration with DeepMind, we present the Deep Planning Network (PlaNet) agent, which learns a world model from image inputs only and successfully leverages it for planning. PlaNet solves a variety of image-based control tasks, competing with advanced model-free agents in terms of final performance while being 5000% more data efficient on average. We are additionally releasing the source code for the research community to build upon.
The PlaNet agent learning to solve a variety of continuous control tasks from images in 2,000 attempts. Previous agents that do not learn a model of the environment often require 50 times as many attempts to reach comparable performance.
How PlaNet Works
In short, PlaNet learns a dynamics model given image inputs and efficiently plans with it to gather new experience. In contrast to previous methods that plan over images, we rely on a compact sequence of hidden or latent states. This is called a latent dynamics model: instead of directly predicting from one image to the next image, we predict the latent state forward. The image and reward at each step are then generated from the corresponding latent state. By compressing the images in this way, the agent can automatically learn more abstract representations, such as positions and velocities of objects, making it easier to predict forward without having to generate images along the way.
Learned Latent Dynamics Model: In a latent dynamics model, the information of the input images is integrated into the hidden states (green) using the encoder network (grey trapezoids). The hidden state is then projected forward in time to predict future images (blue trapezoids) and rewards (blue rectangle).
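
Written out (in our own notation, with latent state $s_t$, action $a_t$, image $o_t$ and reward $r_t$), the latent dynamics model consists of three learned components,

$$s_t \sim p(s_t \mid s_{t-1}, a_{t-1}) \qquad \text{(latent transition)}$$
$$o_t \sim p(o_t \mid s_t) \qquad \text{(image decoder)}$$
$$r_t \sim p(r_t \mid s_t) \qquad \text{(reward predictor)}$$

together with an encoder $q(s_t \mid o_{\le t}, a_{<t})$ that infers the current latent state from past images and actions.
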
To learn an accurate latent dynamics model, we introduce:
  • A Recurrent State Space Model: A latent dynamics model with both deterministic and stochastic components, allowing it to predict a variety of possible futures as needed for robust planning, while remembering information over many time steps. Our experiments indicate that both components are crucial for high planning performance.
  • A Latent Overshooting Objective: We generalize the standard training objective for latent dynamics models to train multi-step predictions, by enforcing consistency between one-step and multi-step predictions in latent space. This yields a fast and effective objective that improves long-term predictions and is compatible with any latent sequence model.
While predicting future images allows us to teach the model, encoding and decoding images (trapezoids in the figure above) requires significant computation, which would slow down planning. However, planning in the compact latent state space is fast since we only need to predict future rewards, and not images, to evaluate an action sequence. For example, the agent can imagine how the position of a ball and its distance to the goal will change for certain actions, without having to visualize the scenario. This allows us to compare 10,000 imagined action sequences with a large batch size every time the agent chooses an action. We then execute the first action of the best sequence found and replan at the next step.
Planning in Latent Space: For planning, we encode past images (gray trapezoid) into the current hidden state (green). From there, we efficiently predict future rewards for multiple action sequences. Note how the expensive image decoder (blue trapezoid) from the previous figure is gone. We then execute the first action of the best sequence found (red box).
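
To make this planning loop concrete, here is a deliberately simplified sketch (in Java, matching the code style used elsewhere in this archive). The LatentModel interface is hypothetical and merely stands in for the learned model; PlaNet's actual planner evaluates the sampled sequences in large batches and refines them iteratively with the cross-entropy method, rather than running a single round of random search as below.

import java.util.Random;

// Hypothetical interface standing in for the learned latent dynamics model (illustration only).
interface LatentModel {
  float[] step(float[] latentState, float[] action); // predict the next latent state
  float reward(float[] latentState);                 // predict the reward for a latent state
}

class LatentSpacePlanner {
  private final LatentModel model;
  private final Random random = new Random();
  private final int numSequences; // e.g. 10,000 imagined action sequences per decision
  private final int horizon;      // planning horizon in steps
  private final int actionSize;   // dimensionality of the action vector

  LatentSpacePlanner(LatentModel model, int numSequences, int horizon, int actionSize) {
    this.model = model;
    this.numSequences = numSequences;
    this.horizon = horizon;
    this.actionSize = actionSize;
  }

  // Returns the first action of the best imagined sequence; the agent replans at every step.
  float[] planNextAction(float[] currentLatentState) {
    float bestReturn = Float.NEGATIVE_INFINITY;
    float[] bestFirstAction = new float[actionSize];
    for (int i = 0; i < numSequences; i++) {
      float[] state = currentLatentState.clone();
      float[] firstAction = null;
      float totalReward = 0f;
      for (int t = 0; t < horizon; t++) {
        float[] action = sampleAction();
        if (t == 0) {
          firstAction = action;
        }
        state = model.step(state, action);  // roll forward in latent space only
        totalReward += model.reward(state); // no image decoding is needed to score a sequence
      }
      if (totalReward > bestReturn) {
        bestReturn = totalReward;
        bestFirstAction = firstAction;
      }
    }
    return bestFirstAction;
  }

  private float[] sampleAction() {
    float[] action = new float[actionSize];
    for (int d = 0; d < actionSize; d++) {
      action[d] = 2f * random.nextFloat() - 1f; // uniform in [-1, 1]
    }
    return action;
  }
}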
Compared to our preceding work on world models, PlaNet works without a policy network -- it chooses actions purely by planning, so it benefits from model improvements on the spot. For the technical details, check out our online research paper or the PDF version.

PlaNet vs. Model-Free Methods
We evaluate PlaNet on continuous control tasks. The agent is only given image observations and rewards. We consider tasks that pose a variety of different challenges:
  • A cartpole swing-up task, with a fixed camera, so the cart can move out of sight. The agent thus must absorb and remember information over multiple frames.
  • A finger spin task that requires predicting two separate objects, as well as the interactions between them.
  • A cheetah running task that includes contacts with the ground that are difficult to predict precisely, calling for a model that can predict multiple possible futures.
  • A cup task, which only provides a sparse reward signal once a ball is caught. This demands accurate predictions far into the future to plan a precise sequence of actions.
  • A walker task, in which a simulated robot starts off by lying on the ground, and must first learn to stand up and then walk.
PlaNet agents trained on a variety of image-based control tasks. The animation shows the input images as the agent is solving the tasks. The tasks pose different challenges: partial observability, contacts with the ground, sparse rewards for catching a ball, and controlling a challenging bipedal robot.
Our work constitutes one of the first examples where planning with a learned model outperforms model-free methods on image-based tasks. The table below compares PlaNet to the well-known A3C agent and the D4PG agent, which combines recent advances in model-free RL. The numbers for these baselines are taken from the DeepMind Control Suite. PlaNet clearly outperforms A3C on all tasks and reaches final performance close to D4PG while requiring roughly 50 times less interaction with the environment on average.
One Agent for All Tasks
Additionally, we train a single PlaNet agent to solve all six tasks. The agent is randomly placed into different environments without knowing the task, so it needs to infer the task from its image observations. Without changes to the hyperparameters, the multi-task agent achieves the same mean performance as the individual agents. While it learns more slowly on the cartpole tasks, it learns substantially faster and reaches a higher final performance on the challenging walker task, which requires exploration.
Video predictions of the PlaNet agent trained on multiple tasks. Holdout episodes collected with the trained agent are shown above and open-loop agent hallucinations below. The agent observes the first 5 frames as context to infer the task and state and accurately predicts ahead for 50 steps given a sequence of actions.
Conclusion
Our results showcase the promise of learning dynamics models for building autonomous RL agents. We advocate for further research that focuses on learning accurate dynamics models on tasks of even higher difficulty, such as 3D environments and real-world robotics tasks. A possible ingredient for scaling up is the processing power of TPUs. We are excited about the possibilities that model-based reinforcement learning opens up, including multi-task learning, hierarchical planning and active exploration using uncertainty estimates.

Acknowledgements
This project is a collaboration with Timothy Lillicrap, Ian Fischer, Ruben Villegas, Honglak Lee, David Ha and James Davidson. We further thank everybody who commented on our paper draft and provided feedback at any point throughout the project.




Source: Google AI Blog


6 highlights from Google for Philippines

At the first ever Google for Philippines event this week, we shared our vision for how we're going to help more Filipinos make the most of what the internet has to offer. This includes key updates and product launches that we hope can drive inclusive growth and help Filipinos participate in an increasingly digital world:

Connecting Filipinos to the internet  

1. Google Station. To help improve internet access, we’re bringing Google Station to the Philippines in partnership with SMART. Together, we’re making Station available at more than 50 locations, including airports in Manila, Clark and Davao, as well as LRT 2 and MRT 3 stations by the end of this month. The platform will be at hundreds more sites country-wide by the end of the year.

2. Google Go. This AI-powered “all-in-one app” helps people, especially those coming online for the first time, discover, share and find content on the internet more easily. You can tap your way through trending queries and topics, or use your voice to say what you’re looking for, and even listen to web pages being read out loud. Google Go is tailor-made for devices that may have less storage or less reliable internet connections, with search results on the app optimized to save up to 40% data.


Providing relevant and localized experiences for Filipinos 

3. Jobs on Google Search. Filipino job seekers will soon be able to find job listings from sites across the web directly in Google Search as we bring jobs on Google Search to the Philippines. They’ll be able to customize their job search through filters, save searches, or be notified when new relevant job postings appear. At launch, this will include half a million job listings from sites such as the Department of Labor and Employment, Kalibrr, Jobayan and Jobs Cloud. To ensure that even more jobs are listed over time, we’ve published open documentation so all third-party job search platforms and direct employers can make their job openings discoverable through jobs on Google Search.

4. Number Coding in Google Maps. Developed in partnership with the Metropolitan Manila Development Authority (MMDA), this new feature helps drivers navigate from A to B while avoiding roads where their vehicle is restricted on its number coding day (based on the last digit of its license plate).

5. Digiskarteng Pinay. YouTube has always been a platform for learning. In collaboration with TESDA, Philippine Commission on Women, Cashalo and Unilever, this program will empower women by connecting them with educational content on YouTube—from health to nutrition, financial literacy and technical skills—that can support them in enhancing their livelihoods. 


Kilalanin si Jhoan: Isang Ma-Digiskarteng Pinay (Meet Jhoan: a digitally resourceful Filipina)

Enabling MSMEs to connect with customers online

6. Making MSMEs more discoverable in partnership with PLDT Enterprise. Working with PLDT Enterprise, we’ll help businesses verify their profiles and get listed on Google My Business, a free and easy-to-use tool for managing their online presence across Google Search and Maps. With searches for products, stores and services “near me” doubling in the last three years, we believe this is an incredible opportunity for Filipino businesses to reach new customers.

Beta Channel Update for Desktop

The beta channel has been updated to 73.0.3683.39 for Windows, Mac, and Linux.


A full list of changes in this build is available in the log. Interested in switching release channels? Find out how here. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.



Abdul Syed
Google Chrome

Change to per-conversation settings in classic Hangouts on Android

Quick launch summary

We're moving notification settings for classic Hangouts conversations to notification channels on devices running Android O and above. With this change, customized ringtones and chat message tones set for individual conversations or contacts will be removed and will default to the general app settings.

New settings page

To view or edit the ringtone and chat message tone settings for classic Hangouts on your Android device, open the classic Hangouts app and, in a conversation, tap the more menu > “Options.”

Availability

Rollout details



G Suite editions

  • All G Suite editions


Stay up to date with G Suite launches

Smart regulation for combating illegal content

We've written before about how we're working to support smart regulation, and one area of increasing attention is regulation to combat illegal content.

As online platforms have become increasingly popular, there’s been a rich debate about the best legal framework for combating illegal content in a way that respects other social values, like free expression, diversity and innovation. Today, various laws provide detailed regulations, including Section 230 of the Communications Decency Act in the United States and the European Union’s e-Commerce Directive.

Google invests millions of dollars in technology and people to combat illegal content in an effective and fair way. It’s a complex task, and, just as in offline contexts, it’s not a problem that can be totally solved. Rather, it’s a problem that must be managed, and we are constantly refining our practices.

In addressing illegal content, we’re also conscious of the importance of protecting legal speech. Context often matters when determining whether content is illegal. Consider a video of military conflict. In one context, the footage might be documentary evidence of atrocities in areas that journalists can reach only with great difficulty and at great risk. In another context, the footage could be promotional material for an illegal organization. Even a highly trained reviewer could have a hard time telling the difference, and we need to get those decisions right across many different languages and cultures, and across the vast scale of audio, video, text, and images uploaded online. We make it easy to submit takedown notices; at the same time, we also create checks and balances against misuse of removal processes. And we look to the work of international agencies and principles from leading groups like the Global Network Initiative.

A smart regulatory framework is essential to enabling an appropriate approach to illegal content. We wanted to share four key principles that inform our practices and that (we would suggest) make for an effective regulatory framework:

  • Shared Responsibility: Tackling illegal content is a societal challenge—in which companies, governments, civil society, and users all have a role to play. Whether a company is alleging copyright infringement, an individual is claiming defamation, or a government is seeking removal of terrorist content, it’s essential to provide clear notice about the specific piece of content to an online platform, and then platforms have a responsibility to take appropriate action on the specific content. In some cases, content may not be clearly illegal, either because the facts are uncertain or because the legal outcome depends on a difficult balancing act; in turn, courts have an essential role to play in fact-finding and reaching legal conclusions on which platforms can rely.

  • Rule of law and creating legal clarity: It’s important to clearly define what platforms can do to fulfill their legal responsibilities, including removal obligations. An online platform that takes other voluntary steps to address illegal content should not be penalized. (This is sometimes called “Good Samaritan” protection.)

  • Flexibility to accommodate new technology: While laws should accommodate relevant differences between platforms, given the fast-evolving nature of the sector, laws should be written in ways that address the underlying issue rather than focusing on existing technologies or mandating specific technological fixes.

  • Fairness and transparency: Laws should support companies’ ability to publish transparency reports about content removals, and provide people with notice and an ability to appeal removal of content. They should also recognize that fairness is a flexible and context-dependent notion—for example, improperly blocking newsworthy content or political expression could cause more harm than mistakenly blocking other types of content. 

With these principles in mind, we support refinement of notice-and-takedown regimes, but we have significant concerns about laws that would mandate proactively monitoring or filtering content, impose overly rigid timelines for content removal, or otherwise impose harsh penalties even on those acting in good faith. These types of laws create a risk that platforms won’t take a balanced approach to content removals, but instead take a “better safe than sorry” approach—blocking content at upload or implementing a “take down first, ask questions later (or never)” approach. We regularly receive overly broad removal requests, and analyses of cease-and-desist and takedown letters have found that many seek to remove potentially legitimate or protected speech.

There’s ample room for debate and nuance on these topics—we discuss them every day—and we’ll continue to seek ongoing collaboration among governments, industry, and civil society on this front. Over time, an ecosystem of tools and institutions—like the Global Internet Forum to Counter Terrorism, and the Internet Watch Foundation, which has taken down child sexual abuse material for more than two decades—has evolved to address the issue. Continuing to develop initiatives like these and other multistakeholder efforts remains critical, and we look forward to progressing those discussions.

Roses are red, violets are blue: six Pixel camera tips, just for you

No matter what your plans are this Valentine’s Day, you’ll probably end up taking a few photos to celebrate or capture the moment—and that's where Pixel's camera comes in. Pixel 3's camera has tools that can help you capture and get creative with your V-Day photos. Here are six tips for our beloved #teampixel.

1. Virtual Valentines

Playground is a creative mode in the Pixel camera that helps you create and play with the world around you. You can send a virtual Valentine, or make your photos and videos stand out with the new Love Playmoji pack and two sticker packs. Capture and celebrate the love in the air today and year-round with interactive hearts, fancy champagne glasses, animated love notes or lovebirds.


2. A V-Day Vision

Your Valentine always stands out to you. So make them the center of focus with Portrait Mode, and watch as the background fades into a beautiful blur… just like the world does when you’re together.

3. Mood Lighting

Romantic dinner date? Use Night Sight to capture the mood when the lights are dim. Pixel’s camera can capture the highlights of your Valentine’s celebrations, even in low light.


4. Picture Perfect Palentines

If you’re celebrating with your Palentines, Group Selfie Cam on Pixel 3 gives everyone the love they deserve in your group selfie.

5. Search at First Sight

The technology that lets you search what you see is baked right into Pixel 3’s camera. See a shirt that would look great on your Galentine? Use Google Lens to find something similar online. Want to know what that flower is in your bouquet? Use Google Lens to identify it. Making a last-minute dinner reservation at that restaurant on the corner? Use Google Lens suggestions to dial the number on their sign with just a tap in the Pixel camera.


6. Sharing is Caring

With unlimited original quality photo and video storage using Google Photos on Pixel 3, you can snap as many shots as you want. From there, you can turn them into a movie or set up a live album, so you can relive (and share) your favorite Palentines’ moments year-round with your friends.

So whether you’re celebrating Valentine’s, Palentine’s or Galentine’s day, Pixel 3’s camera can help you capture your favorite moments with your favorite people.

Lost in translation? Try interpreter mode with the Google Assistant

It’s easier than ever to meet new people and explore new places—but language barriers that prevent us from talking to each other still exist. With the Google Assistant, we're focused on creating the best way to get things done—regardless of who you’re communicating with or what language you speak. To help you connect with people you’re talking to, we recently introduced a new feature called interpreter mode that translates your conversations in real time.

We’ve been piloting interpreter mode at the concierge and front desks of hotels like Caesars Palace in Las Vegas, Dream Downtown in New York City and Hyatt Regency San Francisco Airport, where guests are using it to have free-flowing conversations with hotel staff—even if they don’t speak the same language.

If you want to test out interpreter mode but don’t have an upcoming stay at these hotels—don’t worry! You can now use this feature to translate across 26 languages from the comfort of your own home. Give it a go on Google Home devices and Smart Displays, where you’ll both see and hear the translated conversation. Simply ask your Assistant, “Hey Google, be my Thai interpreter” or “Hey Google, help me speak Spanish” to get started.

To give you an idea of how people from around the world have been using interpreter mode, let’s check out how travelers have used it at hotels to experience and navigate new destinations.


Caesars Palace Las Vegas hosts thousands of guests each year from across the globe, and interpreter mode brings simpler, faster and more effective translation capabilities directly to those guests. Previously, if the concierge staff at Caesars Palace needed to help a non-English-speaking guest, they’d have to dial their in-house translation service and pass the phone back and forth with the guest. Now, with interpreter mode on the Google Home Hub, concierge staff can personally provide guest recommendations in real time—leading to better service, plus quicker and easier guest transactions. In a city known for its entertainment and cuisine, guests are using the feature in languages like Spanish, Portuguese and Italian to speak with the concierge staff for help booking concerts and live theatrical performances, securing restaurant reservations, and getting directions around the Las Vegas Strip.

At Dream Downtown, the technology helps guests find exactly what they're looking for when they turn to the concierge staff for assistance. Spanish, Mandarin and French have been the three most popular languages translated, and patrons are using interpreter mode when they need to do things like check into their rooms or request amenities like towels or ice. During New York Fashion Week, a guest from France urgently needed to find supplies to complete a design project—but she only spoke a little bit of English. Using interpreter mode, she was able to translate the word “tape measure” for Dream staff, who then helped her find one at a store nearby. The guest was pleasantly surprised at how quickly she was able to get assistance—and she finished her design, right on time for the fashion show.

At Hyatt Regency San Francisco Airport, where the concierge team welcomes numerous international guests due to the hotel’s proximity to the airport, Korean, Japanese, and Mandarin have been the top translated languages. Guests typically use the feature to get help with questions about San Francisco landmarks and tourist destinations, and to discover restaurants nearby. Most recently, a Korean guest used interpreter mode to help plan out his first trip to San Francisco. Without interpreter mode, he wouldn’t have been able to take advantage of local recommendations from the concierge.


Interpreter mode can help businesses better serve their guests through an improved customer experience. And this technology can be a helping hand wherever language barriers exist, including at hotels, airports, restaurants, customer service kiosks, organizations aiding humanitarian efforts and much more. If you’re part of a business interested in bringing this technology to your customers, we’d love to hear from you.

Stable Channel Update for Desktop

The stable channel has been updated to 72.0.3626.109 for Windows, Mac, and Linux, which will roll out over the coming days/weeks.


A list of all changes is available in the log. Interested in switching release channels? Find out how. If you find a new issue, please let us know by filing a bug. The community help forum is also a great place to reach out for help or learn about common issues.

Abdul Syed
Google Chrome