Author Archives: Google Developers

How it’s Made: Interacting with Gemini through multimodal prompting

Posted by Alexander Chen, Creative Director


Let’s try an experiment. We’ll show this picture to our multimodal model Gemini and ask it to describe what it sees:

A close up of a hand with five fingers spread out, palm facing down on a wooden surface.
Tell me what you see
Gemini:  I see a person's right hand. The hand is open with the fingers spread apart.

Ok, let’s try this one:

A closeup of the same hand, now balled up in a fist.
Gemini: A person knocking on a wooden door.

How about this one?

A closeup of the same hand, with two fingers extended like that of a peace sign.
Gemini: I see a hand with two fingers extended, which is a common symbol for the number two.

But wait … what if we asked Gemini to reason about all of these images together?

A sequence of three frames depicting the same hand in different formations. The first frame shows the hand with all five fingers spread out, palm facing down. The second frame shows the hand balled up in a fist. The third frame shows the hand with two fingers extended, like that of a peace sign.
What do you think I'm doing? Hint: it's a game.
Gemini: You're playing rock, paper, scissors.

Nice. What else could we try? Let's see if Gemini can notice this pattern:

A sequence of six frames that alternate between a hand that is balled up in a fist, and a hand that has two fingers extended.
I’m playing rock paper scissors. Do you notice a pattern in my gameplay?
Gemini: You are alternating between rock and scissors.

Neat, it spotted the pattern! But does Gemini really know how this game works? Let’s try asking:

Hmm ... do you think it's a good strategy?
Gemini: No, it is not a good strategy. Your opponent will quickly learn your pattern and will be able to win every time. You should try to randomize your choices more.

Good advice, Gemini! Okay, for this last one, let’s put your multimodal reasoning capabilities to the test. Here’s your prompt:

This is a secret message. What does it say?

A six by eight grid of frames of a hands balled up in a fist or the hand with all five fingers spread out, palm facing down. The series of photos spells 'Gemini' in ASCII binary.

Can you figure it out? We have gotten Gemini to solve it, but we won’t spoil it by putting the answer in this post. 🙂



What is multimodal prompting?

Everything we did just now is an example of “multimodal prompting.” We’re basically giving Gemini combinations of different modalities — image and text in this case — and having Gemini respond by predicting what might come next. Gemini’s ability to seamlessly combine these modes together enables new possibilities for what you can do.

In this post, we’ll explore some of the prompting approaches we used in our Hands on with Gemini demo video. We’ll soon be rolling out Gemini for people to try in Google AI Studio, our free, web-based developer tool where you’ll be able to try your own multimodal prompts with Gemini. We’re hoping this guide of starter prompts and ideas helps inspire you to start exploring your own ideas.



🧮 Spatial reasoning and logic

Logic and puzzles are fun way to test Gemini. Let’s try one that requires both left-to-right spatial reasoning and knowledge about our solar system:

Three sticky notes on a wooden surface, each featuring a hand-drawn sketch of a solar-body. The left-most note depicts a drawing of the sun. The middle note has a drawing of saturn. The right-most note shows a drawing of earth.
Is this the right order? Consider the distance from the sun and explain your reasoning.

Gemini: No, the correct order is Sun, Earth, Saturn. The Sun is the closest to the center of the solar system, followed by the Earth, and then Saturn.

Nice! Here’s another, inspired by pinewood derby challenges:

Two sticky notes depicting two different hand-drawn cars going down-hill. The car on the left is blocky and awkwardly shaped, the car on the right is slanted and smooth.
Which of these cars is more aerodynamic? The one on the left or the right? Explain why, using specific visual details.

Gemini: The car on the right is more aerodynamic. It has a lower profile and a more streamlined shape. The car on the left has a higher profile and a more boxy shape, which makes it less aerodynamic.



📷 Image sequences

Next, what if we showed Gemini a sequence of images? Let’s see we can show a few still frames from a game of charades and have Gemini guess the movie:

Four frames from a video of a person leaning back and moving their arms wildly.
Guess what movie I'm acting out.
Gemini: The Matrix
Nice! But which part specifically? Look at my body movements.
Gemini: The part where Neo dodges bullets.



🪄 Magic tricks

Exploring image sequences really lets you dig into how Gemini reasons over time. Let’s try a classic magic trick.

An image of two hands with palms facing the camera, there is a coin in the right hand.
Explain what I'm doing.
Gemini: You are holding a coin in your right hand.
An image of two hands in motion, palms turning inward and away from the camera.
An image of two hands, palms facing down and away from the camera.
Where do you think the coin is?
Gemini: The coin is in your right hand.

Sure, that’s a logical answer given what it’s seen. But let’s pull a sleight of hand as we reveal this next image to Gemini...

An image of two hands, the left palm is facing down and the right hand is tilted upwards revealing nothing under the right palm.
Look here, what happened?
Gemini: The coin is gone.

Gemini notices it’s gone. But can it summarize what may have happened?

Summarize what may have happened, step by step, starting with the first image.

Gemini: 
1. You started with a coin in your right hand.
2. You then made a motion with your hands, and the coin disappeared.
3. You then showed that the coin was gone.

Gemini is leveraging the entire context of this conversation – both text and image – to summarize its answer. Finally, let’s ask:

Can you explain how it disappeared?
Gemini: The coin was palmed.

Gemini makes a really solid guess! But we won’t reveal our secret 🙂.



🔀 Cup shuffling

Let’s try a trick that requires memory and logic: the classic ball and cup shuffling game. For this one, we’ll break it down into steps for Gemini. First, let’s establish the basics with these instructions:

Instructions: There are three cups in position 1 (left), position 2 (middle), position 3 (right).

Next, we’ll need to tell Gemini where we’re placing the ball:

Starting position: The ball is in position 3.
Current state: empty, empty, ball

Notice how we came up with a simple format for remembering where the ball is with “empty, empty, ball.” Next, let’s show Gemini two example turns:

Images:
Three frames depicting a person shuffling three green cups. The person switches positions of the middle and right-most cups. 
Swap: 2 and 3
Current state: empty, ball, empty

Images:
Three frames depicting a person shuffling three green cups. The person switches positions of the middle and left-most cups. 
Swap: 1 and 2
Current state: ball, empty, empty

In these two example turns, we’re both showing it what a swap looks like, and how to update the ball position. Now, let’s see if Gemini can generalize to a swap it hasn't seen:

Images:
Three frames depicting a person shuffling three green cups. The person switches positions of the left-most and right-most cups. 
Swap: 

Here's Gemini's response:

1 and 3
Current state: empty, empty, ball

Gemini got it! It looked at these images and correctly inferred that cups 1 and 3 are being swapped. And it reasoned correctly about how to update the ball position. Let’s ask:

Tell me where the ball is and summarize how it got there, in paragraph form.

Gemini: The ball is in cup 3. 

Starting with the ball in cup 3, the first swap was cup 2 and 3. This put the ball in cup 2. The second swap was cup 1 and 2. This put the ball in cup 1. The third swap was cup 1 and 3. This put the ball in cup 3.

Not only did Gemini get the answer correctly, it accurately summarized the game history. Of course, it won’t always get this challenge right. Sometimes the fake out move (where you swap two empty cups) seems to trip it up, but sometimes it gets that too. But simple prompts like this make it really fun to rapidly test Gemini. You can change the variables in your prompt, including the order of swaps, and see how it does.



🔨 Tool use

If you want to use Gemini in your own apps, you’ll want it to be able to connect to other tools. Let’s try a simple idea where Gemini needs to combine multimodality with tool use: drawing a picture to search for music.

A musical instruments and palm tree sketch on a notebook. The sketch features a guitar, an amplifier, a drum set, and a palm tree drawn in a loose and expressive style. The notebook sits on a wooden desk.
Step 1. Look at this drawing, describe everything in it, and come up with a fun, specific idea for a good musical soundtrack genre, mood, or soundtrack. 
Step 2. Provide a search query in the form: "Search [insert query]"

Gemini: 
The drawing is of a guitar, drums, and an amplifier. There is also a palm tree. The mood is fun, upbeat, and tropical. The genre is reggae. The soundtrack could be used for a movie or TV show set in a tropical location. 

Search "Reggae music for a tropical vacation"

Nice! Gemini both reasons about what it sees and then generates a search query you can parse to do a search. It’s like Gemini is acting like a translator for you – but instead of translating between languages, it’s translating modalities – from drawing to music in this case. With multimodal prompting, you can use Gemini to invent your own entirely new translations between different inputs and outputs.



🕹️Game creation

What if we tried using Gemini to quickly prototype a multimodal game? Here’s an idea: a geography guessing game where you have to point at a map to make your guess. Let’s start by prompting Gemini with the core idea:

Instructions: Let's play a game. Think of a country and give me a clue. The clue must be specific enough that there is only one correct country. I will try pointing at the country on a map.

Next, let’s give Gemini an example turn of gameplay, showing it how we want it to handle both incorrect and correct answers:

Country: This country is huge island that is mostly covered in ice.
(Answer: Greenland) 

Guess:
An image of a world map, a hand icon with the index finger pointing at Canada
I am pointing at the country of Canada *incorrect*

Guess:
An image of a world map, a hand icon with the index finger pointing at Greenland
I am pointing at the country of Greenland *correct*

Let’s give it a go and prompt Gemini to generate a clue:

Clue:
This country is known for its unique wildlife, including kangaroos and koalas. (Answer: Australia)

Ok, that’s a good clue. Let’s test out whether pointing will work. Just for fun, let’s try pointing at the wrong place first:

Guess:
A person pointing at Brazil on a map of the world with their right index finger. The map includes blue oceans, and green continents with no country borders.
I am pointing at the country of Brazil *incorrect*

Great! Gemini looked at my image and figured out I’m pointing at Brazil, and correctly reasoned that’s wrong. Now let’s point at the right place on the map:

A person pointing at Brazil on a map of the world with their right index finger. The map includes blue oceans, and green continents with no country borders.
I am pointing at the country of Australia *correct*

Nice! We’ve basically taught Gemini our game logic just by giving it an example. You'll also notice that it generalized from the illustrated hand in the examples.



⌨️ Coding

Of course, to bring your game idea to life, you’ll eventually have to write some executable code. Let’s see if Gemini can make a simple countdown timer for a game, but with a few fun twists:

Implement a simple timer in HTML/CSS/Javascript. Use a sans serif font and dark mode. Start it at 10 seconds and start counting down. When it reaches zero, replace the timer with a random emoji that is associated with excitement and motivation! Then go back to the timer at 10 seconds and start counting down again.

With just this single instruction, Gemini gives us a working timer that does what we asked for:

An animated gif of a countdown timer starting from 10. At the end of the countdown, a rocket emoji is shown followed lightning bolt emoji and a confetti emoji.

My favorite part is scrolling through Gemini’s source code to find the array of motivational emojis it picked for me:

 const emojis  = ['🚀', '⚡️', '🎉', '🎊', '🥳', '🤩', '✨'];



👀 A sneak peek

Throughout this post, we’ve been giving Gemini an input, and having Gemini make predictions for what might come next. That's basically what prompting is. And our inputs have been multimodal – image and text, combined.

But so far we've only shown Gemini responding in text. Maybe you’re wondering, can Gemini also respond with a combination of image and text? It can! This is a capability of Gemini called “interleaved text and image generation.” While this feature won’t be ready in the first version of Gemini for people to try, we hope to roll it out soon. Here’s a sneak peek of what’s possible.

Let’s see if we could use Gemini to provide everyday creative inspiration. And let’s try it in a domain that requires a bit of multimodal reasoning ... knitting! 🧶. Similar to our map game above, let's provide one example turn of interaction:

Give me three ideas for crochet creations I could make with these two colors:

Balls of blue and yellow yarn on a wooden table.

Colors: I see blue and yellow yarn.
Idea 1: How about a cute dog with one yellow ear? 
Image 1:

A blue and yellow crocheted dog head with ears placed on a wooden table.

Idea 2: Or how about a jellyfish with a few yellow tentacles?
Image 2:

A blue and yellow crocheted octopus with tentacles placed on a wooden table.

Idea 3: Or how about a blue cat with a yellow nose?
Image 3:

A blue and yellow crocheted cat with whiskers placed on a wooden table.

We’re essentially teaching Gemini about how we want each interaction to go: “I’ll take a photo of two balls of yarn, and I expect you (Gemini) to both come up with an idea for something I could make, and generate an image of it.”

Now, let’s show it a new pair of yarn colors it hasn't yet seen, and see if it can generalize:

Give me three ideas for crochet creations I could make with these two colors:

Balls of blue and pink yarn on a wooden table.

Gemini:
Colors: I see blue and pink yarn.
Idea 1: How about a pig with blue ears?
Image 1:

A blue and pink crocheted pig face with ears placed on a wooden table.

Idea 2: Or an octopus with pink and blue tentacles?
Image 2:

A blue and yellow crocheted octopus with tentacles placed on a wooden table.

Idea 3: Or a bunny with a pink nose?
Image 3:

A blue and pink crocheted bunny placed on a wooden table.

Nice! Gemini correctly reasoned about the new colors (“I see blue and pink yarn”) and generated these ideas and the images in a single, interleaved output of text and image.

What Gemini did here is fundamentally different from today’s text-to-image models. It's not just passing an instruction to a separate text-to-image model. It sees the image of my actual yarn on my wooden table, truly doing multimodal reasoning about my text and image together.


What's Next?

We hope you found this a helpful starter guide to get a sense of what’s possible with Gemini. We’re very excited to roll it out to more people soon so you can explore your own ideas through prompting. Stay tuned!

Global Google Developer Experts Share Their Favorite Tools and Advice for New Developers

Posted by Lyanne Alfaro, DevRel Program Manager, Google Developer Studio

Developer Journey is a monthly series highlighting diverse and global developers sharing relatable challenges, opportunities, and wins in their journey. Every month, we will spotlight developers around the world, the Google tools they leverage, and the kinds of products they are building.

This month we speak with global Google Developer Experts in Firebase, Women Techmakers, and beyond, to learn more about their favorite Google tools, the applications they’ve built to serve diverse communities, and their best advice for anyone just getting started as a developer.

Juan Lombana

Headshot of Juan Lombana, smiling
Mexico City, Mexico
Founder, Mercatitlán

What Google tools have you used to build?

Google Analytics and Firebase's A/B testing features have been pivotal in our data-driven approach, enabling continuous improvement in our conversion strategies. More recently, Bard has become a significant asset in developing new products and in our educational endeavors, especially with the introduction of our AI course. Its utility in both product development and educational settings is profound.


Which tool has been your favorite to use? Why?

If I had to choose, it would be Google Ads. Its ability to consistently drive new customers and provide unparalleled visibility to quality products is unmatched. While it may not traditionally be considered a 'tool' in the strictest sense, its impact on business growth and visibility is indisputable.


Please share with us about something you’ve built in the past using Google tools.

My entire business, Mercatitlán, has been built and scaled using Google Tools. We have cultivated a community of over 40,000 paid students, educating them on effective use of Google Ads, leveraging Bard for enhanced website content, and employing Google Analytics for strategic A/B testing to boost sales. The transformational impact of these tools on both my business and my students' ventures is a testament to their potential.


What will you create with Google Bard?

The integration of Bard AI into our daily operations is revolutionizing the way we approach digital marketing. Beyond its current uses in social media content creation, ad ideas generation, email composition, and customer support enhancement, we're exploring several innovative applications:

  • Personalized Marketing Campaigns: Using Bard AI, we can analyze customer data and preferences to create highly personalized marketing campaigns. This helps in delivering more relevant content to our audience, thereby increasing engagement and conversion rates. 
  • Competitive Analysis: By analyzing competitor data, Bard AI can help us understand their strategies, strengths, and weaknesses. This intelligence is crucial for refining our marketing approach and differentiating our brand in the marketplace.
  • Content Optimization for SEO: Bard can assist in optimizing website and blog content for search engines. By understanding and integrating key SEO principles, it can help us rank higher in search results, thus improving our online visibility. 
  • Automated Reporting and Insights: Automating the generation of marketing reports and insights with Bard saves time and resources, allowing our team to focus on strategy and creativity rather than manual data analysis.

What advice would you give someone starting in their developer journey?

The key is to start with action rather than waiting for perfection. Adopt a mindset focused on experimentation and analytics. This approach allows you to follow data-driven insights rather than solely relying on innovation, leading to significant societal impact through technology.


Jirawat Karanwittayakarn

Headshot of Jirawat Karanwittayakarn, smiling
Bangkok, Thailand
Tech Evangelist, LINE Thailand

What Google tools have you used to build?

I have used a variety of Firebase services to build LINE chatbots for a number of years. These services have included Cloud Functions, Cloud Firestore, Cloud Storage, Firebase Hosting, and etc. Recently I have also used the PaLM API, a very powerful tool that allows me to build Generative AI chatbots.


Which tool has been your favorite to use? Why?

Firebase is my favorite tool because it is a platform that provides a complete set of tools for building and managing mobile, web, and chatbots. It is very easy to use and has a wide range of features that make it a great choice for developers of all levels. Furthermore, Firebase services have allowed me to scale my chatbots and make them more reliable.


Please share with us about something you’ve built in the past using Google tools.

  • LINE Developers TH is a chatbot that allows Thai developers to learn about LINE APIs and get started with building services. It also provides users with the ability to try out demos of LINE APIs.
  • TrueMoney is a wallet app that I have built in the past using Firebase. The app allows users to store money, send money, and pay bills. It is a very popular app in Thailand, with over 10 million users.
  • Sanook is an app that allows users to access news, articles, and other content from the number one web portal in Thailand on their mobile devices.

What will you create with Google Bard?

I would like to create a use case of building a powerful LINE chatbot using PaLM API and Firebase for developers. I believe this will be a great way to showcase the power of these tools and how they can be used to create innovative solutions.


What advice would you give someone starting in their developer journey?

First and foremost, I would encourage them to be curious and always be willing to learn new things. The world of technology is constantly changing, so it's important to stay up-to-date on the latest trends and technologies. This can be done by reading articles, attending conferences, and taking online courses.

Secondly, I would recommend that they find a mentor or role model who can help guide them on their journey. Having someone who has been through the process can be invaluable in providing support and advice. They can help you identify areas where you need to improve, and provide you with tips and tricks for success.

Finally, I would encourage them to never give up. The road to becoming a developer can be challenging, but it's also incredibly rewarding. If you're passionate about technology, then don't let anything stop you from pursuing your dreams.


Laura Morinigo

Headshot of Lauren Moringo, smiling
London, England
Women Techmakers Ambassador
Principal Engineer and Consultant, Samsung Electronics UK

What Google tools have you used to build?

I have used tools like Google Cloud and Firebase.


Which tool has been your favorite to use? Why?

I would say Firebase! It helped me to build web apps and explore new technologies easily while saving a lot of time and resources. Additionally, a lot of functionalities have been added recently. Over the years, I've witnessed its evolution, with the addition of numerous functionalities that continually enhance its utility and user experience. This constant innovation within Firebase not only simplifies complex tasks but also opens doors to creative possibilities in web app development.


Please share with us about something you’ve built in the past using Google tools.

I've been leading a project in partnership with the United Nations to help share information about its worldwide global goals. We used Firebase hosting and Cloud functions for the first release of the web app and it was a success! It felt very good to help create tools that support a good cause.


What will you create with Google Bard?

I'm experimenting with the current extensions to improve personal productivity. It's very interesting how you can improve the way that you do your daily tasks.


What advice would you give someone starting in their developer journey?

Remember that as a developer you will have the power to create! Use this power to build personal projects and combine it with things that you enjoy. You will start building a portfolio and have fun while learning. Finally, don't hesitate to find a mentor and connect with a community of developers to support and guidance in your journey. You can find a lot of help, improve your networking, and even have friends for life!

Women in ML Symposium 2023: Meet the presenters



Posted by Sharbani Roy – Senior Director, Product Management, Google

We’re back with the third annual Women in Machine Learning Symposium on December 7, 2023

Join us virtually from 9:30 am to 1:00 pm PT for an immersive and insightful set of deep dives for every level of Machine Learning experience.

The Women in ML Symposium is an inclusive event for anyone passionate about the transformative fields of Machine Learning (ML) and Artificial Intelligence (AI). Meet this year’s women in ML as they uncover practical applications across multiple industries and discuss the latest advancements in frameworks, generative AI, and more.


Joana Carrasqueira, presenter for “Enabling Anyone to Build with Google AI”

Joana is a Developer Relations Lead for AI/ML at Google and her mission is to empower individuals and organizations to harness the power of AI to address real-world challenges.

She is a business leader with a track record of bringing strategic vision and global cross-functional programs to life. She’s also the creator of Google’s Women in ML program and flagship symposium, a pioneering initiative that has equipped thousands of developers with knowledge and skills in AI/ML.

Prior to Google, she worked at the Silicon Valley Innovation Center on innovation consulting for Forbes top500, startups and Venture Capital firms. Served as Education Manager at the International Pharmaceutical Federation, working closely with WHO, UNESCO, the United Nations and started her career at the Portuguese Pharmaceutical Society.

Joana holds an MBA from IE Business School, a Master in Pharmaceutical Sciences and a Leadership Certificate from U.C. Berkeley in California.



Sharbani Roy, presenter for “What’s New in Machine Learning?”

Sharbani is Sr. Director in Google’s Core Machine Learning group.

Before joining Google, Sharbani led engineering and product teams in Amazon Alexa, focused on media streaming, real-time communication, and applied ML (e.g., NLU, CV, and AR) for 1P/3P developers and end consumers.

Sharbani holds degrees in physics and mathematics from the University of Chicago and an MBA from Stanford University, and lives in Seattle with her husband and three children.



Eve Phillips, presenter for “Future of Frameworks: Navigate the OSS Landscape"

Eve is a Director of Product Management at Google.

Currently, Eve leads the ML Frameworks product team, which includes responsibility for TensorFlow, JAX and Keras. Previously, she led product teams within Google for Clinicians and ChromeOS. Prior to Google, she served as CEO of Empower Interactive, delivering tech-enabled behavioral health.

Earlier, she held roles in leading technology companies and investors including Trilogy, Microsoft, and Greylock.

Eve earned a BS and M.Eng in EECS from MIT and an MBA from Stanford.



Meenu Gaba, presenter for “Data-Centric AI: A New Paradigm"

Meenu leads the Machine Learning infrastructure team at Google, with a mission to power AI innovation with world-class ML infrastructure and services.

She is a technology leader with years of experience launching new products and growing small teams into mature scalable, multi-tiered organizations that are poised to deliver high quality products. Meenu enjoys fast-paced, dynamic, highly iterative/innovative environments and has lots of experience in balancing these disciplines while fostering a people-first culture and forming solid grounds for cross-functional relationships.

Meenu holds a Master's degree in Computer Science. In her free time, she enjoys hiking, solving crosswords, and watching movies.



Kelly Shaefer, presenter for “Maximize Your Data Exploration”

Kelly leads product teams at Google Labs, building both entirely new AI products and AI-enabled features into Google's largest existing products.

In the past, she led the Growth team for Google Workspace, including Gmail, Drive, Docs, and many more.

Outside of Google, she led the Enterprise product team at Stripe and was the P&L owner for Stripe's multi-billion dollar Payments area.

Kelly has an undergraduate degree from Wharton at UPenn, and an MBA from Harvard Business School.



Divyashree Sreepathihalli, presenter for “Keras: Shortcut to AI Mastery”

Divya is a talented machine learning software engineer who is currently a part of the Keras team at Google.

In this role, she specializes in developing Keras core modeling APIs and KerasCV to improve the functionality of the software.

Prior to joining Google, Divya worked as a Deep Learning Scientist for Zazu Sensor, a startup group in Intel's Emerging Growth Incubation (EGI) group. Her work there focused on computer vision and deep learning algorithm development for object detection and tracking, resulting in significant advancements for the startup.

Divya completed her Masters in Computer Engineering from Texas A&M University where she focused on Artificial intelligence in 2017.



Na Li, presenter for “Prototype ML with Visual Blocks”

Na Li is a software engineer manager from Google CoreML.

She leads a team to build developer tools to support ML development journey, from prototyping to model visualization and benchmarking.

Prior to Google, she was a research scientist at Harvard, working in HCI domain.

Throughout her career, Na strives to make ML accessible for everyone.



Zoe Wang, presenter for “Deploying ML Models to Mobile Devices”

Zoe is a technical program manager at Google.

Her career has been focused on Machine Learning (ML) productionization.

Currently she works with her team bringing ML models to mobile devices that power some of AI features for Pixel and other edge devices.

Prior to Google, Zoe worked at Meta on ML Platforms for end-to-end ML lifecycles.



Yvonne Li, presenter for “New GenAI Products and Solutions on Google Cloud”

Yvonne Li is a software engineer on the Duet Platform team at Google, where she focuses on improving the quality of generative AI models.

As a machine learning engineer and developer advocate at IBM, she designed and developed language models and curated open source datasets.

She has over 3 years of experience in the big tech industry, and is passionate about using machine learning to solve real-world problems.

Yvonne is the author of two Coursera courses: Data Analysis with R, and, Data Visualization with R.



Nithya Natesan, presenter for “AI-powered Infrastructure: Cloud TPUs”

Nithya Natesan is a Group Product Manager in the Cloud ML Accelerators team focussing on GPU / TPU offerings for Google Cloud.

Prior to Google, she was head of product management at NVIDIA, launching several products like DGX Cloud, Base Command Platform.

She has ~14 years of experience in hyper convergence Data Center software products, with recent focus on ML / AI Infra and Platform products. She is passionate about building rock solid PM teams, and shipping high quality usable ML / AI products.

Nithya has also won industry accolades namely WomenImpactTech 2023.



Andrada Vulpe, presenter for “Community Matters: 8 Reasons Why You Should Be Involved with Kaggle”

Andrada is a Data Scientist at Endava, a Notebooks Grandmaster on Kaggle, a Dev Expert at Weights and Biases and a proud Z by HP Data Science Global Ambassador.

She is highly passionate about Python, R, Machine and Deep Learning, powerful visualizations and everything in between.

Andrada finished her MSc in Data Science and Analytics in the UK and won 2 Kaggle Analytics competitions.



Jeehae Lee, presenter for “From Recovering Pro Golfer to AI Entrepreneur”

Jeehae Lee is a golf industry executive who has worked to create and build transformational sports technology businesses.

As the Co-Founder & CEO of Sportsbox AI, Jeehae is currently developing products using AI-enabled 3D motion analysis technology that will help participants of various sports and fitness activities learn and improve their skills.

Before founding Sportsbox, she spent five years between 2015 and 2020 at Topgolf Entertainment Group, leading strategy and new business development for various divisions including Toptracer. Between 2012 and 2013, she was at global sports and entertainment marketing agency, IMG, representing professional golfer icon Michelle Wie West. Prior to her career in sports business, she played professional golf at the highest level in the sport, competing on the LPGA tour for three years between 2009 and 2011.

Jeehae is a proud graduate of Phillips Academy in Andover, MA, and has a BA in Economics from Yale and an MBA from The Wharton School at University of Pennsylvania.



Jingwan (Cynthia) Lu, panelist for “The Impact of Generative AI in Different Industries”

Cynthia is a senior director from Adobe leading an applied research organization focusing on developing the Adobe Firefly family of GenAI models built from the ground up.

Her team started training Adobe’s first large-scale foundational model and helped rally together the rest of the company to roll out a new web-based product called Firefly featuring the image generation model as the first step in early 2023.

The same technology and its extension power Adobe Photoshop’s Generative Fill and Generative Expand features giving users intelligent image inpainting and outpainting experience. Time recognizes Adobe Photoshop Generative Fill and Generative Expand as best inventions of 2023 in the AI category.

Before Firefly, Jingwan was a computer vision research scientist and team lead who pioneered and led a large group effort to explore early generative models such as GANs within Adobe.



Wei Xiao, panelist for “The Impact of Generative AI in Different Industries”

Wei is the Director of Developer Relations at NVIDIA for the Middle East, Africa, and emerging regions. Her primary focus is to drive AI and accelerated computing integration within the ecosystem.

Before assuming her current role, Wei Xiao headed Ecosystem Engineering and Evangelism teams at both ARM and Samsung Semiconductor.

In addition to her professional endeavors, Wei dedicates her free time to teaching AI courses at the Graduate School of Computer Science at Santa Clara University.



Priya Mathur, panelist for “The Impact of Generative AI in Different Industries”

Priya is a Staff Data Science Manager at Google and she is the founder of Sparkle – GenAI Data Analyst.

At Google, she leads Data Science for Home Platform Monetization and GenAI efforts for DSPA.

Previously at Groupon, she led Data Science for App Push Notifications and TV Ads.



Katherine Chou, panelist for “The Impact of Generative AI in Different Industries”

Katherine is the Senior Director of Research and Innovations at Google with a specific focus on nurturing scientific and technical breakthroughs that can lead to global impact for science, health, climate, and advancement of platform technologies for our developers and researchers.

Katherine is focused on improving the availability and accuracy of healthcare using machine learning. She is a serial intrapreneur, particularly interested in removing health inequities and improving health and well-being outcomes across all populations.

She previously developed products within Google[x] Labs for Life Sciences (now Verily) and co-founded Medical Brain (now “Health AI'') at Google. She also headed up global teams to develop partner solutions and establish developer ecosystems for Mobile Payments, Mobile Search, GeoCommerce, YouTube, and Android.

Outside of Google, she is a Board member and Program Chair of Lewa Wildlife Conservancy, a Scientific Advisor to the ARCS Foundation, a fellow of the Zoological Society of London, and collaborates with other wildlife NGOs and the Cambridge Business Sustainability Programme in applying the Silicon Valley innovation mindset to new areas.

Katherine holds a double major in Computer Science and Economics at Stanford University and an M.S. in CS specialized in graphics.



Jaimie Hwang, presenter for “Take Action, Learn More, Start Building with Google AI”

Jaimie Hwang is a global product marketing leader with over a decade of experience, specifically in AI/ML.

She has built and led global product marketing teams at a number of AI companies, including an award-winning computer vision startup and tech giant Amazon.

She specializes in executive thought leadership, product storytelling, and integrated GTM strategy. She is passionate about promoting AI technology that is built responsibly and solves real-world problems in a human-centric way.

Jaimie holds a BS in Journalism and Integrated Marketing and Communications from Northwestern University. She lives in Seattle, Washington.


Save your spot at WiML Symposium 2023

The Women in ML Symposium offers sessions for all expertise levels, from beginners to advanced practitioners. RSVP today to secure your spot and explore our comprehensive agenda. We can’t wait to see you there!

Women in ML Symposium 2023: Meet the presenters



Posted by Sharbani Roy – Senior Director, Product Management, Google

We’re back with the third annual Women in Machine Learning Symposium on December 7, 2023

Join us virtually from 9:30 am to 1:00 pm PT for an immersive and insightful set of deep dives for every level of Machine Learning experience.

The Women in ML Symposium is an inclusive event for anyone passionate about the transformative fields of Machine Learning (ML) and Artificial Intelligence (AI). Meet this year’s women in ML as they uncover practical applications across multiple industries and discuss the latest advancements in frameworks, generative AI, and more.


Joana Carrasqueira, presenter for “Enabling Anyone to Build with Google AI”

Joana is a Developer Relations Lead for AI/ML at Google and her mission is to empower individuals and organizations to harness the power of AI to address real-world challenges.

She is a business leader with a track record of bringing strategic vision and global cross-functional programs to life. She’s also the creator of Google’s Women in ML program and flagship symposium, a pioneering initiative that has equipped thousands of developers with knowledge and skills in AI/ML.

Prior to Google, she worked at the Silicon Valley Innovation Center on innovation consulting for Forbes top500, startups and Venture Capital firms. Served as Education Manager at the International Pharmaceutical Federation, working closely with WHO, UNESCO, the United Nations and started her career at the Portuguese Pharmaceutical Society.

Joana holds an MBA from IE Business School, a Master in Pharmaceutical Sciences and a Leadership Certificate from U.C. Berkeley in California.



Sharbani Roy, presenter for “What’s New in Machine Learning?”

Sharbani is Sr. Director in Google’s Core Machine Learning group.

Before joining Google, Sharbani led engineering and product teams in Amazon Alexa, focused on media streaming, real-time communication, and applied ML (e.g., NLU, CV, and AR) for 1P/3P developers and end consumers.

Sharbani holds degrees in physics and mathematics from the University of Chicago and an MBA from Stanford University, and lives in Seattle with her husband and three children.



Eve Phillips, presenter for “Future of Frameworks: Navigate the OSS Landscape"

Eve is a Director of Product Management at Google.

Currently, Eve leads the ML Frameworks product team, which includes responsibility for TensorFlow, JAX and Keras. Previously, she led product teams within Google for Clinicians and ChromeOS. Prior to Google, she served as CEO of Empower Interactive, delivering tech-enabled behavioral health.

Earlier, she held roles in leading technology companies and investors including Trilogy, Microsoft, and Greylock.

Eve earned a BS and M.Eng in EECS from MIT and an MBA from Stanford.



Meenu Gaba, presenter for “Data-Centric AI: A New Paradigm"

Meenu leads the Machine Learning infrastructure team at Google, with a mission to power AI innovation with world-class ML infrastructure and services.

She is a technology leader with years of experience launching new products and growing small teams into mature scalable, multi-tiered organizations that are poised to deliver high quality products. Meenu enjoys fast-paced, dynamic, highly iterative/innovative environments and has lots of experience in balancing these disciplines while fostering a people-first culture and forming solid grounds for cross-functional relationships.

Meenu holds a Master's degree in Computer Science. In her free time, she enjoys hiking, solving crosswords, and watching movies.



Kelly Shaefer, presenter for “Maximize Your Data Exploration”

Kelly leads product teams at Google Labs, building both entirely new AI products and AI-enabled features into Google's largest existing products.

In the past, she led the Growth team for Google Workspace, including Gmail, Drive, Docs, and many more.

Outside of Google, she led the Enterprise product team at Stripe and was the P&L owner for Stripe's multi-billion dollar Payments area.

Kelly has an undergraduate degree from Wharton at UPenn, and an MBA from Harvard Business School.



Divyashree Sreepathihalli, presenter for “Keras: Shortcut to AI Mastery”

Divya is a talented machine learning software engineer who is currently a part of the Keras team at Google.

In this role, she specializes in developing Keras core modeling APIs and KerasCV to improve the functionality of the software.

Prior to joining Google, Divya worked as a Deep Learning Scientist for Zazu Sensor, a startup group in Intel's Emerging Growth Incubation (EGI) group. Her work there focused on computer vision and deep learning algorithm development for object detection and tracking, resulting in significant advancements for the startup.

Divya completed her Masters in Computer Engineering from Texas A&M University where she focused on Artificial intelligence in 2017.



Na Li, presenter for “Prototype ML with Visual Blocks”

Na Li is a software engineer manager from Google CoreML.

She leads a team to build developer tools to support ML development journey, from prototyping to model visualization and benchmarking.

Prior to Google, she was a research scientist at Harvard, working in HCI domain.

Throughout her career, Na strives to make ML accessible for everyone.



Zoe Wang, presenter for “Deploying ML Models to Mobile Devices”

Zoe is a technical program manager at Google.

Her career has been focused on Machine Learning (ML) productionization.

Currently she works with her team bringing ML models to mobile devices that power some of AI features for Pixel and other edge devices.

Prior to Google, Zoe worked at Meta on ML Platforms for end-to-end ML lifecycles.



Yvonne Li, presenter for “New GenAI Products and Solutions on Google Cloud”

Yvonne Li is a software engineer on the Duet Platform team at Google, where she focuses on improving the quality of generative AI models.

As a machine learning engineer and developer advocate at IBM, she designed and developed language models and curated open source datasets.

She has over 3 years of experience in the big tech industry, and is passionate about using machine learning to solve real-world problems.

Yvonne is the author of two Coursera courses: Data Analysis with R, and, Data Visualization with R.



Nithya Natesan, presenter for “AI-powered Infrastructure: Cloud TPUs”

Nithya Natesan is a Group Product Manager in the Cloud ML Accelerators team focussing on GPU / TPU offerings for Google Cloud.

Prior to Google, she was head of product management at NVIDIA, launching several products like DGX Cloud, Base Command Platform.

She has ~14 years of experience in hyper convergence Data Center software products, with recent focus on ML / AI Infra and Platform products. She is passionate about building rock solid PM teams, and shipping high quality usable ML / AI products.

Nithya has also won industry accolades namely WomenImpactTech 2023.



Andrada Vulpe, presenter for “Community Matters: 8 Reasons Why You Should Be Involved with Kaggle”

Andrada is a Data Scientist at Endava, a Notebooks Grandmaster on Kaggle, a Dev Expert at Weights and Biases and a proud Z by HP Data Science Global Ambassador.

She is highly passionate about Python, R, Machine and Deep Learning, powerful visualizations and everything in between.

Andrada finished her MSc in Data Science and Analytics in the UK and won 2 Kaggle Analytics competitions.



Jeehae Lee, presenter for “From Recovering Pro Golfer to AI Entrepreneur”

Jeehae Lee is a golf industry executive who has worked to create and build transformational sports technology businesses.

As the Co-Founder & CEO of Sportsbox AI, Jeehae is currently developing products using AI-enabled 3D motion analysis technology that will help participants of various sports and fitness activities learn and improve their skills.

Before founding Sportsbox, she spent five years between 2015 and 2020 at Topgolf Entertainment Group, leading strategy and new business development for various divisions including Toptracer. Between 2012 and 2013, she was at global sports and entertainment marketing agency, IMG, representing professional golfer icon Michelle Wie West. Prior to her career in sports business, she played professional golf at the highest level in the sport, competing on the LPGA tour for three years between 2009 and 2011.

Jeehae is a proud graduate of Phillips Academy in Andover, MA, and has a BA in Economics from Yale and an MBA from The Wharton School at University of Pennsylvania.



Jingwan (Cynthia) Lu, panelist for “The Impact of Generative AI in Different Industries”

Cynthia is a senior director from Adobe leading an applied research organization focusing on developing the Adobe Firefly family of GenAI models built from the ground up.

Her team started training Adobe’s first large-scale foundational model and helped rally together the rest of the company to roll out a new web-based product called Firefly featuring the image generation model as the first step in early 2023.

The same technology and its extension power Adobe Photoshop’s Generative Fill and Generative Expand features giving users intelligent image inpainting and outpainting experience. Time recognizes Adobe Photoshop Generative Fill and Generative Expand as best inventions of 2023 in the AI category.

Before Firefly, Jingwan was a computer vision research scientist and team lead who pioneered and led a large group effort to explore early generative models such as GANs within Adobe.



Wei Xiao, panelist for “The Impact of Generative AI in Different Industries”

Wei is the Director of Developer Relations at NVIDIA for the Middle East, Africa, and emerging regions. Her primary focus is to drive AI and accelerated computing integration within the ecosystem.

Before assuming her current role, Wei Xiao headed Ecosystem Engineering and Evangelism teams at both ARM and Samsung Semiconductor.

In addition to her professional endeavors, Wei dedicates her free time to teaching AI courses at the Graduate School of Computer Science at Santa Clara University.



Priya Mathur, panelist for “The Impact of Generative AI in Different Industries”

Priya is a Staff Data Science Manager at Google and she is the founder of Sparkle – GenAI Data Analyst.

At Google, she leads Data Science for Home Platform Monetization and GenAI efforts for DSPA.

Previously at Groupon, she led Data Science for App Push Notifications and TV Ads.



Katherine Chou, panelist for “The Impact of Generative AI in Different Industries”

Katherine is the Senior Director of Research and Innovations at Google with a specific focus on nurturing scientific and technical breakthroughs that can lead to global impact for science, health, climate, and advancement of platform technologies for our developers and researchers.

Katherine is focused on improving the availability and accuracy of healthcare using machine learning. She is a serial intrapreneur, particularly interested in removing health inequities and improving health and well-being outcomes across all populations.

She previously developed products within Google[x] Labs for Life Sciences (now Verily) and co-founded Medical Brain (now “Health AI'') at Google. She also headed up global teams to develop partner solutions and establish developer ecosystems for Mobile Payments, Mobile Search, GeoCommerce, YouTube, and Android.

Outside of Google, she is a Board member and Program Chair of Lewa Wildlife Conservancy, a Scientific Advisor to the ARCS Foundation, a fellow of the Zoological Society of London, and collaborates with other wildlife NGOs and the Cambridge Business Sustainability Programme in applying the Silicon Valley innovation mindset to new areas.

Katherine holds a double major in Computer Science and Economics at Stanford University and an M.S. in CS specialized in graphics.



Jaimie Hwang, presenter for “Take Action, Learn More, Start Building with Google AI”

Jaimie Hwang is a global product marketing leader with over a decade of experience, specifically in AI/ML.

She has built and led global product marketing teams at a number of AI companies, including an award-winning computer vision startup and tech giant Amazon.

She specializes in executive thought leadership, product storytelling, and integrated GTM strategy. She is passionate about promoting AI technology that is built responsibly and solves real-world problems in a human-centric way.

Jaimie holds a BS in Journalism and Integrated Marketing and Communications from Northwestern University. She lives in Seattle, Washington.


Save your spot at WiML Symposium 2023

The Women in ML Symposium offers sessions for all expertise levels, from beginners to advanced practitioners. RSVP today to secure your spot and explore our comprehensive agenda. We can’t wait to see you there!

See what’s new & what’s possible with Firebase at Demo Day

Posted by Annum Munir – Product Marketing Manager

This article also appears on the Firebase blog

After sharing tons of teasers and behind-the-scenes footage over the past few weeks, we’re excited to announce that our very first Demo Day is finally here! Today, we released 10 demos (i.e. pre-recorded short videos) alongside technical resources to show you what’s new, what’s possible, and how you can solve your biggest app development challenges with Firebase. You don’t want to miss this peek at the future of Firebase!

Tune in from anywhere, at any time to check out the demos at your own pace.


Build, run and scale your apps with Firebase

The demos are designed to help you build and run full stack apps faster, harness the power of AI to work smarter and build engaging experiences, and use Google technology and tools together to be more productive.

AI demos

AI has everyone buzzing, but are you wondering how to practically use it in your app development workflow? You won’t want to miss these demos:


Flutter and Project IDX demos

We also worked closely with our friends across Google, including Flutter and Project IDX (to name a few) to demo integrated solutions from your favorite Google products so you get a seamless development experience. Check them out:


App development demos

And last but not least, we’re committed to helping you improve all parts of app development. Watch these demos on strengthening app security, releasing safely and reducing risk, and automating and scaling your infrastructure. We’ve even added new quality-of-life features and given the Firebase console a highly-requested makeover that’ll take you to the dark side.


Happy Demo Day!

Check out the demos and then join the conversation on X (formerly Twitter) and LinkedIn using #FirebaseDemoDay to ask questions, give us feedback, and see what the rest of the community is saying.

Full-stack development in Project IDX

Posted by Kaushik Sathupadi, Prakhar Srivastav, and Kristin Bi – Software Engineers; Alex Geboff – Technical Writer

We launched Project IDX, our experimental, new browser-based development experience, to simplify the chaos of building full-stack apps and streamline the development process from (back)end to (front)end.

In our experience, most web applications are built with at-least two different layers: a frontend (UI) layer and a backend layer. When you think about the kind of app you’d build in a browser-based developer workspace, you might not immediately jump to full-stack apps with robust, fully functional backends. Developing a backend in a web-based environment can get clunky and costly very quickly. Between different authentication setups for development and production environments, secure communication between backend and frontend, and the complexity of setting up a fully self-contained (hermetic) testing environment, costs and inconveniences can add up.

We know a lot of you are excited to try IDX yourselves, but in the meantime, we wanted to share this post about full-stack development in Project IDX. We’ll untangle some of the complex situations you might hit as a developer building both your frontend and backend layers in a web-based workspace — developer authentication, frontend-backend communication, and hermetic testing — and how we’ve tried to make it all just a little bit easier. And of course we want to hear from you about what else we should build that would make full-stack development easier for you!


Streamlined app previews

First and foremost, we've streamlined the process of enabling your applications frontend communication with its backend services in the VM, making it effortless to preview your full-stack application in the browser.

IDX workspaces are built on Google Cloud Workstations and securely access connected services through Service Accounts. Each workspace’s unique service account supports seamless, authenticated preview environments for your applications frontend. So, when you use Project IDX, application previews are built directly into your workspace, and you don’t actually have to set up a different authentication path to preview your UI. Currently, IDX only supports web previews, but Android and iOS application previews are coming soon to IDX workspaces near you.

Additionally, if your setup necessitates communication with the backend API under development in IDX from outside the browser preview, we've established a few mechanisms to temporarily provide access to the ports hosting these API backends.


Simple front-to-backend communication

If you’re using a framework that serves both the backend and frontend layers from the same port, you can pass the $PORT flag to use a custom PORT environment variable in your workspace configuration file (powered by Nix and stored directly in your workspace). This is part of the basic setup flow in Project IDX, so you don’t have to do anything particularly special (outside of setting the variable in your config file). Here’s an example Nix-based configuration file:


{ pkgs, ... }: {

# NOTE: This is an excerpt of a complete Nix configuration example.

# Enable previews and customize configuration
idx.previews = {
  enable = true;
  previews = [
    {
      command = [
        "npm"
        "run"
        "start"
        "--"
        "--port"
        "$PORT"
        "--host"
        "0.0.0.0"
        "--disable-host-check"
      ];
      manager = "web";
      id = "web";
    }
  ];
};

However, if your backend server is running on a different port from your UI server, you’ll need to implement a different strategy. One method is to have the frontend proxy the backend, as you would with Vite's custom server options.

Another way to establish communication between ports is to set up your code so the javascript running on your UI can communicate with the backend server using AJAX requests.

Let’s start with some sample code that includes both a backend and a frontend. Here’s a backend server written in Express.js:


import express from "express";
import cors from "cors";


const app= express();
app.use(cors());

app.get("/", (req, res) => {
    res.send("Hello World");
});

app.listen(6000, () => {
    console.log("Server is running on port 6000");
})

The bolded line in the sample — app.use(cors()); — sets up the CORS headers. Setup might be different based on the language/framework of your choice, but your backend needs to return these headers whether you’re developing locally or on IDX.

When you run the server in the IDX terminal, the backend ports show up in the IDX panel. And every port that your server runs on is automatically mapped to a URL you can call.

Moving text showing the IDX terminal and panel

Now, let's write some client code to make an AJAX call to this server.


// This URL is copied from the side panel showing the backend ports view
const WORKSPACE_URL = "https://6000-monospace-ksat-web-prod-79679-1677177068249.cluster-lknrrkkitbcdsvoir6wqg4mwt6.cloudworkstations.dev/";

async function get(url) {
  const response = await fetch(url, {
    credentials: 'include',
  });
  console.log(response.text());
}

// Call the backend
get(WORKSPACE_URL);

We’ve also made sure that the fetch() call includes credentials. IDX URLs are authenticated, so we need to include credentials. This way, the AJAX call includes the cookies to authenticate against our servers.

If you’re using XMLHttpRequest instead of fetch, you can set the “withCredentials” property, like this:


const xhr = new XMLHttpRequest();
xhr.open("GET", WORKSPACE_URL, true);
xhr.withCredentials = true;
xhr.send(null);

Your code might differ from our samples based on the client library you use to make the AJAX calls. If it does, check the documentation for your specific client library on how to make a credentialed request. Just be sure to make a credentialed request.


Server-side testing without a login

In some cases you might want to access your application on Project IDX without logging into your Google account — or from an environment where you can’t log into your Google account. For example, if you want to access an API you're developing in IDX using either Postman or cURL from your personal laptops's command line. You can do this by using a temporary access token generated by Project IDX.

Once you have a server running in Project IDX, you can bring up the command menu to generate an access token. This access token is a short-lived token that temporarily allows you to access your workstation.

It’s extremely important to note that this access token provides access to your entire IDX workspace, including but not limited to your application in preview, so you shouldn’t share it with just anyone. We recommend that you only use it for testing.

Generate access token in Project IDX

When you run this command from IDX, your access token shows up in a dialog window. Copy the access token and use it to make a cURL request to a service running on your workstation, like this one:


$ export ACCESS_TOKEN=myaccesstoken
$ curl -H "Authorization: Bearer $ACCESS_TOKEN" https://6000-monospace-ksat-web-prod-79679-1677177068249.cluster-lknrrkkitbcdsvoir6wqg4mwt6.cloudworkstations.dev/
Hello world

And now you can run tests from an authenticated server environment!


Web-based, fully hermetic testing

As we’ve highlighted, you can test your application’s frontend and backend in a fully self-contained, authenticated, secure environment using IDX. You can also run local emulators in your web-based development environment to test your application’s backend services.

For example, you can run the Firebase Local Emulator Suite directly from your IDX workspace. To install the emulator suite, you’d run firebase init emulators from the IDX Terminal tab and follow the steps to configure which emulators you want on what ports.

ALT TEXT

Once you’ve installed them, you can configure and use them the same way you would in a local development environment from the IDX terminal.


Next Steps

As you can see, Project IDX can meet many of your full-stack development needs — from frontend to backend and every emulator in between.

If you're already using Project IDX, tag us on social with #projectidx to let us know how Project IDX has helped you with your full-stack development. Or to sign up for the waitlist, visit idx.dev.

Aligning the user experience across surfaces for Google Pay

Posted by Dominik Mengelt – Developer Relations Engineer

During the last months we've been working hard to align the Google Pay user experience across Web and Android. We are committed to advancing all Google Pay surfaces progressively, and creating a more cohesive experience for your users. In addition, the Google Pay sheets for Android and Chrome on Android now use the latest Material 3 design system with Web to follow in early 2024.

UX improvements on Android

Aligning the bottom sheets on Android and Chrome for Android (Mobile Web) led to a ~2.5% increase in conversion rate and a ~39% reduction in errors for users using Google Pay with Chrome on Android[1].

Side by side photos of Gogle Pay sheet on Android and Mobile Web
Figure 1: The identical Google Pay bottom sheets for Android (left) and Chrome on Android (right)


A completely revamped Google Pay sheet on the Web

On the web we aligned the user experience to be the same as on Android. Additionally we gave the Payment Handler window a more minimalistic look. With these changes we are seeing a conversion rate increase of ~9%.[1]

Google Pay displayed inside the new minimalistic Payment Handler window
Figure 2: Google Pay displayed inside the new minimalistic Payment Handler window


No changes required!

Whether you are a merchant integrating Google Pay on your own or through a PSP, you don’t need to make any changes. We've already rolled out these changes to most of our users. This means that your users are likely already benefiting from the new experience or will very soon. For certain features, for example dynamic price updates, Google Pay will temporarily show the previous user experience. We are actively working on migrating all features to benefit from the new updated design.


Getting started with Google Pay

Not yet using Google Pay? Take a look at the documentation to start integrating Google Pay today. Learn more about the integration by taking a look at our sample application for Android on GitHub or use one of our button components for your web integration. When you are ready, head over to the Google Pay & Wallet console and submit your integration for production access.

Follow @GooglePayDevs on X (formerly Twitter) for future updates. If you have questions, tag @GooglePayDevs and include #AskGooglePayDevs in your tweets.


[1] internal Google study

#WeArePlay | Meet Geraldo from Utah. More stories from around the world.

Posted by Leticia Lago, Developer Marketing

Another month, another series of #WeArePlay stories from apps and games we all love. From a Salt Lake City-based music editing app to successful game studios from Indonesia, Uruguay and Türkiye - discover the inspiring founders behind them.

This time we’re starting in the US with Geraldo. Inspired by his mom’s studies in computer engineering, he decided to start his own tech company at just 16 years of age. But he was also a keen musician and merged both his passions in Moises, alongside childhood friend and co-founder Eddie. The app uses artificial intelligence to remove vocals and instruments from any song. Geraldo describes the process as like “getting a smoothie and removing only the banana” – complex, to say the least, but Moises makes it easy. He hopes “to democratize access to cutting edge audio tools for everyday musicians.”

#WeArePlay Diori & Agung MINIMO South Tangerang Indonesia g.co/play/weareplay Google Play

Next up, we’re crossing the Pacific over to Indonesia where colleagues and game enthusiasts Diori and Agung decided to collaborate outside of the office on their own independent project. This culminated in the launch of their studio, Minimo, with their most successful game, Mini Racing Adventures, accumulating over 38 million downloads to date. The pair channeled Agung’s passion for cars and mechanics into this particular release, but next they’re shifting genres and working on a new shooter game.

#WeArePlay Pablo & Gonzalo Ironhide Game Studio Montevideo Uruguay g.co/play/weareplay Google Play

Now we’re heading down to Uruguay where friends Pablo, Gonzalo and Alvaro had a dream of making games for a living and created Ironhide Game Studio in 2010, learning how to code for mobiles from scratch. As Pablo puts it: “Over the years we’ve realized that what we have is special, because we have the passion, but we also work really hard. This has allowed us to create something great.” Their popular title, Kingdom Rush: Tower Defence, is a strategy game set in a medieval settlement and chock-full of action-filled battles. Looking to the future, they’re hoping to branch into multiplayer games and expand their Kingdom Rush saga.

#WeArePlay Remi, Mithat, Rina, Fuat & Barkin SPYKE GAMES Instanbul Türkiye g.co/play/weareplay Google Play

And finally we’re crossing over to Europe to meet Rina. While working in private equity and meeting an array of business heads, she was inspired to pursue an entrepreneurial path herself. Seeing how popular gaming was becoming, Rina delved into creating titles for a Turkish audience. She struck gold with her first studio becoming a tech unicorn, and soon followed it up with Spyke Games, launched alongside her brother Remi and friends Mithat, Barkin and Fuat. Their title Tile Busters combine social multiplayer fun and skill-based puzzle solving. Soon, they’re releasing a follow-up, Blitz Busters, keeping their goal of being “great content developers creating games that people crave more of.”

Discover more global #WeArePlay stories and share your favorites.



How useful did you find this blog post?

Cybersecurity Awareness Month: Web GDE Shrutirupa Banerjiee shares how we can stay safe in a world of cyber attacks

Posted by Kevin Hernandez, Developer Relations Community Manager

For Cybersecurity Awareness Month, we are celebrating Shrutirupa Banerjiee, Web GDE.

The web can be an excellent tool to learn a new skill, connect with people all over the world, digest information, or use new technologies such as Google Bard. However, there can be threats that loom whenever you go online - malware and social engineering attacks (also known as phishing) are some examples of today’s cyber attacks that can steal data or gain access to your system. Luckily, we have people like Shrutirupa Banerjiee, Web GDE, working on ways to keep companies and individuals safe from these threats. Shrutirupa got her start in the field by way of Blockchain security and eventually transitioned into a malware research role as a Senior Security Researcher. As an advocate for cybersecurity, Shrutirupa shares what threats are facing us today and what steps we can take to keep ourselves safe.

Headshot of Shrutirupa Banerjiee, smiling
Shrutirupa Banerjiee, Google Developer Expert, Web

Threats facing the web today

Shrutirupa is mostly concerned with malware and especially those that bide their time and sit in your systems for months and sometimes even for years. She describes, “These attacks get into your system and sit there for months while you’re unable to identify that there is any kind of malware or malicious program. When you’ve already gained trust, it will start connecting with the malicious servers.” A recent example of this type of attack was the SolarWinds attack. With this case, hackers were able to gain unauthorized access to the SolarWinds network in 2019, injected their malicious code into the SolarWinds software in February of 2020, and in March of 2020 SolarWinds unknowingly pushed a software update which included the malicious code (source: NPR). This malware was downloaded by 18,000 customers, compromising the data of companies and government agencies which included the Treasury and the Pentagon.

One mission that Shrutirupa is actively working towards is bridging the gap between developers and cybersecurity professionals. By being made aware of the new threats that face the web, developers can actively safeguard companies and consumers from malware attacks.


How you can stay safe online

Shrutirupa recommends the following to help prevent data breaches or attacks on our systems:

  • Think before you click: Be aware of anything that you’re downloading and do your proper research on the source of the software. You can see what others are saying about the software on community sites like Quora or Reddit.
  • Software updates: Make sure your software updates are up-to-date since these contain patches.
  • Antivirus scanners: There are many free options that you can add to your computer and use to run regular scans to ensure that your system is running safely.
  • Multi-Factor Authentication (MFA): This method allows you to have an extra layer of security when logging onto sites and notifies you of unauthorized logins.
  • Developers and researchers working together: Consult cybersecurity professionals like Shrutirupa! Cybersecurity professionals are on top of new threats and can make you aware of what you should prioritize as a developer.

Shrutirupa has also been able to leverage Google technologies and the GDE program in order to educate and test in a safe environment. “Google provides me with an environment where I can mentor students who want to get into cybersecurity or want to do development in a secure way. Google also has resources like Google Cloud Platform where you can practice and test everything while learning,” she says.

With people of all ages accessing the internet, Cybersecurity Awareness Month is crucial and Shrutirupa believes that awareness of threats should be a regular occurrence due to the importance of it. She states, “We have children, parents, and grandparents all accessing the internet. Because of this, we should always be vigilant about what we’re downloading and understand previous and new cybersecurity attacks so we can prevent them.”

You can find Shrutirupa on LinkedIn, GitHub, Twitter, and YouTube.


The Google Developer Experts (GDE) program is a global network of highly experienced technology experts, influencers, and thought leaders who actively support developers, companies, and tech communities by speaking at events and publishing content.

All treats, no tricks: 6 solutions to common developers challenges

Posted by Google for Developers

For many, Halloween is the perfect excuse to dress up and celebrate the things that haunt us. Google for Developers is embracing the spirit of the season by diving into the spine-chilling challenges that spook software developers and engineers. Read on to uncover these lurking terrors and discover the tricks – and treats – to conquer them.


The code cemetery

Resilient code requires regular updates, and when it comes to solving bugs, it’s much easier to find them when there are fewer lines of code. When faced with legacy or lengthy code, consider simplifying and refreshing it to make it more manageable – because no one likes an ancient or overly complex codebase. Here are some best practices.

Start small: Don't try to update your entire codebase at once. Instead, start by updating small, isolated parts of the codebase to minimize the risk of introducing new bugs.

Use a version control system: Track your changes and easily revert to a previous version if necessary.

Consider a refactoring tool: This can help you to make changes to your code without breaking it.

Test thoroughly: Make sure to test your changes thoroughly before deploying them to production. This includes testing the changes in isolation, as well as testing them in conjunction with the rest of the codebase. See more tips about testing motivation below.

Document your changes: Include new tooling, updated APIs, and any changes so other developers understand what you have done and why.


Testing terrors

When you want to build and ship quickly, it’s tempting to avoid writing tests for your code because they might slow you down in the short term. But beware, untested code will come back to haunt you later. Testing is a best practice that can save you time, money, and angst in the long run. Even if you know you should run tests, it doesn’t mean you want to. Use these tips to help make writing tests easier.

Test gamification: Turn test writing into a game. Challenge yourself to write tests faster than your coworker can say "code coverage."

Pair programming: Write tests together with a colleague. It's like having a workout buddy – more fun and motivating.

Set up test automation: Automate tests wherever possible– it's better AND more efficient.


A monster problem: not being able to choose your tech stack

Many developers have strong preferences when it comes to products, but sometimes legacy technology or organizational needs can limit choices. This can be deflating, especially if it prevents you from using the latest tools. If you’re faced with a similar situation, it’s worth expressing your recommendations to your team. Here’s how:

Lobby for change: If the current tech stack really isn't working out, advocate for a change. This may require documentation over a series of events, but you can use that to build your case.

Pitch the benefits: If you’re ready to share your preferences, explain how your tech stack of choice benefits the project, similarly to how optimized code improves performance.

Showcase expertise: Demonstrate your knowledge in your preferred stack, whether it’s through a Proof of Concept or a presentation.

Upskill: If you have to dive into a top-down tech stack that you are not familiar with, consider it a learning opportunity. It’s like exploring a new coding language.

Compromise is key: First, recognize that all of the points above are still well-worth aiming for, but sometimes, you do have to compromise. Think of it as working with legacy code - not ideal, but doable. So if you aren’t able to influence in your favor, don’t be dismayed.


Not a trick: ship your code smarter

The only thing worse than spending the end of the week fixing buggy code isexcept for spending the weekend fixing buggy code when you had other plans. Between less time to react to problems, taking up personal time, and fewer people available to help troubleshoot – shipping code when you don’t have the proper resources in place to help is risky at best. Here are a handful of best practices to help you build a better schedule and avoid the Saturday and Sunday Scaries.

Consider business hours and user impact: Schedule deployments during off-peak times when fewer users will be impacted. For B2B companies, Friday afternoons can minimize disruption for customers, but for smaller companies, Friday deployments might mean spending your weekend fixing critical issues. Pick a schedule that works for you.

Automate testing: Implement automated testing in your development process to catch issues early.

Make sure your staging environment is right: Thoroughly test changes in a staging environment that mirrors production.

Be rollback-ready: Have a rollback plan ready to revert quickly if problems arise.

Monitoring and alerts: Set up monitoring and alerts to catch issues 24/7.

Communication: Ensure clear communication among team members regarding deployment schedules and procedures.

Scheduled deployments: If you’re a team who doesn’t regularly ship at the end of the week, consider READ-ONLY Fridays. Or if necessary, schedule Friday deployments for the morning or early afternoon.

Weekend on-call: Consider a weekend on-call rotation to address critical issues.

Post-deployment review: Analyze and learn from each deployment's challenges to improve processes.

Plan thoroughly: Ensure deployment processes are well-documented and communication is clear across teams and stakeholders.

Evaluate risks: Assess potential business and user impact to determine deployment frequency and timing.


A nightmare come true: getting hacked

Realizing you've been hacked is a heart-stopping event, but even the most tech-savvy developers are vulnerable to attacks. Before it happens to you, remember to implement these best practices.

Keep your systems and software up-to-date: Think of it as patching vulnerabilities in your code.

Use strong passwords: Just like strong encryption, use robust passwords.

Use two-factor authentication: Always add a second layer of security.

Beware of phishermen: Don't take the bait. Be as cautious with suspicious emails as you are with untested code.

Perform security audits: Regularly audit your systems for vulnerabilities, like running code reviews but for your cybersecurity.

Backup plan: Just like version control, maintain backups. They're your safety net in case things go full horror-movie.


The horror: third party data breaches

Data breaches are arguably the most terrifying yet plausible threat to developer happiness. No company wants to be associated with them, let alone the dev who chose the service or API to work with. Here are some tips for minimizing issues with third party vendors to help you avoid this scenario.

Perform due diligence on third-party vendors: Before working with a third-party vendor, carefully review their security practices and policies. Ask about security certifications, vulnerability management practices, and their incident response plan.

Require vendors to comply with security requirements: Create or add your input in a written contract with each third-party vendor that outlines the security requirements that the vendor must meet. This contract should include requirements for data encryption, access control, and incident reporting.

Monitor vendor activity: Ensure vendors comply with the security requirements in the contract by reviewing audit logs and conducting security assessments. Only grant access to data that a vendor needs to perform their job duties to help to minimize the impact of a data breach if the vendor is compromised.

Implement strong security controls: Within your own systems, protect data from unauthorized access through firewalls, intrusion detection systems, and data encryption.

Be wary of third-party APIs: Vet all security risks. Carefully review the API documentation to understand the permissions that are required and to ensure the API uses strong security practices.

Use secure coding practices: Use input validation, escaping output, and strong cryptography.

Keep software up to date: Always update with the latest security patch to help to protect against known vulnerabilities.


Creepin' it real

It’s easy to get spooked knowing what can go wrong, but by implementing these best practices, the chance of your work going awry goes down significantly.

What other spine-chilling developer challenges have you experienced? Share them with the community.