Gemma 4: Expanding the Gemmaverse with Apache 2.0

For over 20 years, Google has maintained an unwavering commitment to the open-source community. Our belief has been simple: open technology is good for our company, good for our users, and good for our world. This commitment to fostering collaborative learning and rigorous testing has consistently proven more effective than pursuing isolated improvements. It's been our approach ever since the 2005 launch of Google Summer of Code, and through our open-sourcing of Kubernetes, Android, and Go, and it remains central to our ongoing, daily work alongside maintainers and organizations.

Today, we are taking a significant step forward in that journey. Since the first Gemma launch, the community has downloaded Gemma models over 400 million times and built a vibrant universe of over 100,000 inspiring variants, known in the community as the Gemmaverse.

The release of Gemma 4 under the Apache 2.0 license — our most capable open models ranging from edge devices to 31B parameters — provides cutting-edge AI models for this community of developers. The industry-standard Apache license broadens the horizon for Gemma 4's applicability and usefulness, providing well-understood terms for modification, reuse, and further development.

A long legacy of open research

We are committed to making helpful, accessible AI technology and research so that everyone can innovate and grow. That's why many of our innovations are freely available, easy to deploy, and useful to developers across the globe. We have a long history of making our foundational machine-learning research, including word2vec, JAX, and the seminal Transformer paper, publicly available for anyone to use and study.

We accelerated this commitment last year. By sharing models that interpret complex genomic data and identify tumor variants, we contributed to the "magic cycle" of research breakthroughs that translate into real-world impact. This week, however, marks a pivotal moment — Gemma 4 models are the first in the Gemmaverse to be released under the OSI-approved Apache 2.0 license.

Empowering developers and researchers to deliver breakthrough innovations

Since we first launched Gemma in 2024, the community of early adopters has grown into a vast ecosystem of builders, researchers, and problem solvers. Gemma is already supporting sovereign digital infrastructure, from automating state licensing in Ukraine to scaling Project Navarasa across India's 22 official languages. And we know that developers need autonomy, control, and clarity in licensing for further AI innovation to reach its full potential.

Gemma 4 brings three essential elements of free and open-source software directly to the community:

  • Autonomy: By letting people build on and modify the Gemma 4 models, we are empowering researchers and developers with the freedom to advance their own breakthrough innovations however they see fit.
  • Control: We understand that many developers require precise control over their development and deployment environments. Gemma 4 allows for local, private execution that doesn't rely on cloud-only infrastructure.
  • Clarity: By applying the industry-standard Apache 2.0 license terms, we are providing clarity about developers' rights and responsibilities so that they can build freely and confidently from the ground up without the need to navigate prescriptive terms of service.

Building together to drive real-world impact

Gemma 4, as a release, is an invitation. Whether you are a scientific researcher exploring the language of dolphins, an industry developer building the next generation of open AI agents, or a public institution looking to provide more effective, efficient, and localized services to your citizens, Google is excited to continue building with you. The Gemmaverse is your playground, and with Apache 2.0, the possibilities are more boundless than ever.

We can't wait to see what you build.

Google Workspace’s continuous approach to mitigating indirect prompt injections


Indirect prompt injection (IPI) is an evolving threat vector targeting users of complex AI applications with multiple data sources, such as Workspace with Gemini. This technique enables the attacker to influence the behavior of an LLM by injecting malicious instructions into the data or tools used by the LLM as it completes the user’s query. This may even be possible without any input directly from the user.


IPI is not the kind of technical problem you “solve” and move on. Sophisticated LLMs with increasing use of agentic automation combined with a wide range of content create an ultra-dynamic and evolving playground for adversarial attacks. That’s why Google takes a sophisticated and comprehensive approach to these attacks. We’re continuously improving LLM resistance to IPI attacks and launching AI application capabilities with ever-improving defenses. Staying ahead of the latest indirect prompt injection attacks is critical to our mission of securing Workspace with Gemini. 


In our previous blog “Mitigating prompt injection attacks with a layered defense strategy”, we reviewed the layered architecture of our IPI defenses. In this blog, we’ll share more detail on the continuous approach we take to improve these defenses and to solve for new attacks.

New attack discovery

By proactively discovering and cataloging new attack vectors through internal and external programs, we can identify vulnerabilities and deploy robust defenses ahead of adversarial activity. 

Human Red-Teaming

Human Red-Teaming uses adversarial simulations to uncover security and safety vulnerabilities. Specialized teams execute attacks based on realistic user profiles to exploit weaknesses, coordinating with product teams to resolve identified issues.

Automated Red-Teaming

Automated Red-Teaming is done via dynamic, machine-learning-driven frameworks to stress-test environments. By algorithmically generating and iterating on attack payloads, we can mimic the behavior of sophisticated threats at scale. This allows us to map complex attack paths and validate the effectiveness of our security controls across a much wider range of edge cases than manual testing could achieve on its own.

Google AI Vulnerability Rewards Program (VRP)

The Google AI Vulnerability Rewards Program (VRP) is a critical tool for enabling collaboration between Google and external security researchers who discover new attacks leveraging IPI. Through this VRP, we recognize and reward contributors for their research. We also host regular live hacking events where we give invited researchers access to pre-release features, proactively uncovering novel vulnerabilities. These partnerships enable Google to quickly validate, reproduce, and resolve externally discovered issues.

Publicly disclosed AI attacks 

Google utilizes open-source intelligence feeds to stay on top of the latest publicly disclosed IPI attacks, across social media, press releases, blogs, and more. From there, new AI vulnerabilities are sourced, reproduced, and catalogued internally to ensure our products are not impacted. 

Vulnerability catalog 

All newly discovered vulnerabilities go through a comprehensive analysis process performed by the Google Trust, Security, & Safety teams. Each new vulnerability is reproduced, checked for duplications, mapped into attack technique / impact category, and assigned to relevant owners. The combination of new attack discovery sources and vulnerability catalog process helps Google stay on top of the latest attacks in an actionable manner. 
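
The cataloging steps above (reproduce, de-duplicate, map to a technique and impact category, assign an owner) can be pictured as a small data model. The following is a minimal Kotlin sketch under stated assumptions: every type name, field, and category value is hypothetical and is not Google's internal schema.

```kotlin
// Hypothetical categories; the real taxonomy is not described in this post.
enum class Technique { HIDDEN_TEXT, TOOL_ABUSE, DATA_EXFILTRATION }
enum class Impact { LOW, MEDIUM, HIGH }

data class Finding(
    val source: String,      // e.g. "red-team", "vrp", "osint"
    val payload: String,     // the injected instruction text
    val reproduced: Boolean, // whether the attack could be reproduced
    val technique: Technique,
    val impact: Impact,
)

data class CatalogEntry(val finding: Finding, val owner: String)

// Normalize payloads so trivially different copies count as duplicates.
fun fingerprint(payload: String): String =
    payload.trim().lowercase().replace(Regex("\\s+"), " ")

// Keep reproduced findings, drop duplicates, and assign an owner per technique.
fun catalog(findings: List<Finding>, owners: Map<Technique, String>): List<CatalogEntry> =
    findings
        .filter { it.reproduced }
        .distinctBy { fingerprint(it.payload) }
        .map { CatalogEntry(it, owners[it.technique] ?: "unassigned") }
```

De-duplicating on a normalized fingerprint rather than the raw payload keeps the catalog from filling up with copies that differ only in case or whitespace.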


Synthetic data generation 

After we discover, curate, and catalog new attacks, we use Simula to generate synthetic data expanding these new attacks. This process is essential because it allows the team to develop attack variants for completeness and coverage, and to prepare new training and validation data sets. This accelerated workflow has boosted synthetic data generation by 75%, supporting large-scale defense model evaluation and retraining, as well as updating the data set used for calculating and reporting on defense effectiveness.


Ongoing defense refinement 

Continually updating and enhancing our defense mechanisms allows us to address a broader range of attack techniques, effectively reducing the overall attack surface. Updating each defense type requires different tasks, from config updates, to prompt engineering and ML model retraining. 

Deterministic Defenses

Deterministic defenses, including user confirmation, URL sanitization, and tool chaining policies, are designed for rapid response against new or emerging prompt injection attacks by relying on simple configuration updates. These defenses are governed by a centralized Policy Engine, with configurations for policies like baseline tool calls, URL sanitization, and tool chaining. For immediate threats, this configuration-based system facilitates a streamlined process for "point fixes," such as regex takedowns, providing an agile defense layer that acts faster than traditional ML/LLM model refresh cycles.
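
To make the idea of configuration-driven defenses concrete, here is a minimal Kotlin sketch of the three checks named above: URL sanitization, regex takedowns, and a tool chaining policy. It is an illustration only. `PolicyConfig` and its fields are assumptions made for this example and do not describe the actual Policy Engine.

```kotlin
import java.net.URI
import java.net.URISyntaxException

// Hypothetical policy configuration; field names are illustrative only.
data class PolicyConfig(
    val allowedHosts: Set<String>,
    val blockedPatterns: List<Regex>,                  // "point fix" regex takedowns
    val forbiddenToolChains: Set<Pair<String, String>>, // (earlier tool, next tool)
)

// URL sanitization: accept only https URLs whose host is on the allowlist.
fun isUrlAllowed(url: String, config: PolicyConfig): Boolean {
    val uri = try { URI(url) } catch (e: URISyntaxException) { return false }
    val host = uri.host?.lowercase() ?: return false
    return uri.scheme.equals("https", ignoreCase = true) && host in config.allowedHosts
}

// Regex takedown: flag content that matches any known-bad pattern.
fun isContentBlocked(text: String, config: PolicyConfig): Boolean =
    config.blockedPatterns.any { it.containsMatchIn(text) }

// Tool chaining: reject a call sequence containing a forbidden adjacent pair.
fun isToolChainAllowed(calls: List<String>, config: PolicyConfig): Boolean =
    calls.zipWithNext().none { it in config.forbiddenToolChains }
```

Because all three checks read from one configuration object, responding to a new attack in this sketch means adding a pattern or a tool pair to the configuration rather than retraining a model.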

ML-Based Defenses

After generating synthetic data that expands new attacks into variants, the next step is to retrain our ML-based defenses to mitigate these new attacks. We partition the synthetic data described above into separate training and validation sets to ensure performance is evaluated against held-out examples. This approach ensures repeatability, data consistency for fixed training/testing, and establishes a scalable architecture to support future extensions towards fully automated model refresh.
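
One common way to get a repeatable training/validation partition is to derive the split from a stable hash instead of a random shuffle. The Kotlin sketch below shows that technique; it is an assumption about how such a split could be built, not a description of the pipeline used here. The `Example` type and its `family` field (the original attack a variant was expanded from) are hypothetical.

```kotlin
// Hypothetical record for one synthetic attack example.
// `family` identifies the original attack that the variant was derived from.
data class Example(val family: String, val text: String)

// Bucket by a stable hash of the family, so that every variant of one attack
// lands in the same split and the assignment never changes between runs.
fun isValidation(family: String, validationPercent: Int): Boolean =
    Math.floorMod(family.hashCode(), 100) < validationPercent

// Returns (training, validation).
fun split(
    examples: List<Example>,
    validationPercent: Int = 20,
): Pair<List<Example>, List<Example>> {
    require(validationPercent in 0..100) { "validationPercent must be in 0..100" }
    val (validation, training) =
        examples.partition { isValidation(it.family, validationPercent) }
    return training to validation
}
```

Hashing the family rather than each example keeps near-duplicate variants out of both sets at once, so the validation examples stay genuinely held out.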

LLM-Based Defenses

Using the new synthetic data examples, our LLM-based defenses go through prompt engineering with refined system instructions. The goal is to iteratively optimize these prompts against agreed-upon defense effectiveness metrics, ensuring the models remain resilient against evolving threat vectors.

Gemini Model Hardening 

Beyond system-level guardrails and application-level defenses, we prioritize ‘model hardening’, a process that improves the Gemini model's internal capability to identify and ignore harmful instructions within data. By utilizing synthetic datasets and fresh attack patterns, we can model various threat iterations. This enables us to strengthen the Gemini model's ability to disregard harmful embedded commands while following the user's intended request. Through this process of model hardening, Gemini has become significantly more adept at detecting and disregarding injected instructions. This has led to a reduction in the success rate of attacks without compromising the model's efficiency during routine operations.

Defense effectiveness 

To measure the real-world impact of defense improvements, we simulate attacks against many Workspace features. This process uses the newly generated synthetic attack data described in this post to create a robust, end-to-end evaluation. The simulation is run against multiple Workspace apps, such as Gmail and Docs, using a standardized set of assets to ensure reliable results. To determine the exact impact of a defense improvement (e.g., an updated ML model or a new LLM prompt optimization), the end-to-end evaluation is run with and without the defense enabled. This comparative testing provides the essential "before and after" metrics needed to validate defense efficacy and drive continuous improvement.
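
The "before and after" comparison can be expressed as a simple metric. The Kotlin sketch below computes an attack success rate for each run and the relative reduction between them. The metric definition and all names are assumptions for illustration; this post does not specify which effectiveness metrics are used.

```kotlin
// Outcome of one simulated attack. `succeeded` means the injected
// instruction was followed. Names here are illustrative.
data class AttackResult(val app: String, val succeeded: Boolean)

// Fraction of simulated attacks that succeeded (0.0 for an empty run).
fun attackSuccessRate(results: List<AttackResult>): Double =
    if (results.isEmpty()) 0.0
    else results.count { it.succeeded }.toDouble() / results.size

// Relative reduction in attack success rate when the defense is enabled.
// Returns 0.0 when the baseline has no successful attacks to reduce.
fun relativeReduction(baseline: List<AttackResult>, defended: List<AttackResult>): Double {
    val before = attackSuccessRate(baseline)
    val after = attackSuccessRate(defended)
    return if (before == 0.0) 0.0 else (before - after) / before
}
```

For example, if 4 of 10 simulated attacks succeed without the defense and 1 of 10 succeed with it, the relative reduction is 75%. Running both evaluations on the same set of attacks is what makes the two rates comparable.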



Moving forward 

Our commitment to AI security is rooted in the principle that every day you’re safer with Google. While the threat landscape of indirect prompt injection evolves, we are building Workspace with Gemini to be a secure and trustworthy platform for AI-first work. IPI is a complex security challenge, which requires a defense-in-depth strategy and continuous mitigation approach. To get there, we’re combining world-class security research, automated pipelines, and advanced ML/LLM-based models. This robust and iterative framework helps to ensure we not only stay ahead of evolving threats but also provide a powerful, secure experience for both our users and customers.


Enabling user-initiated purchases, starting with higher access to advanced AI features in Workspace

We’re introducing a new capability that allows end users to purchase Google Workspace add-ons directly, starting with the AI Expanded Access add-on for users in the United States and Canada with a Google Workspace Business edition. This feature is designed to help organizations and admins address the growing demand for AI tools while simplifying how users can get higher access to the latest advanced AI tools in Workspace.

Currently, users who want more access to advanced AI capabilities - such as higher usage in the Gemini app and NotebookLM, or specialized tools like Veo 3 or Avatar creation in Google Vids - must rely on their IT administrators to procure and assign add-on licenses. This can create bottlenecks for motivated users who have adopted these tools and are eager to further boost their productivity with Google Workspace. With this update, eligible users can purchase the AI Expanded Access add-on themselves using their own form of payment and an individual billing account.

Admins retain control
You retain full control of this feature, including the ability to turn this capability off for your entire organization or for specific groups. Administrators can review which users have purchased subscriptions in the Admin console and can cancel those subscriptions if they wish. Seeing which users have purchased subscriptions can also help central teams make broader licensing decisions.

Getting started

  • Admins: This feature will be on by default, and can be disabled at the domain, OU, or group level. Visit the Help Center to learn more.
  • End users: If your organization is eligible and the setting has not been disabled (see availability details below), users may see an in-product prompt to upgrade to get higher access when they reach their usage limit for certain AI features in Google Workspace. Clicking this prompt will allow the user to buy the add-on directly using their own payment method. Once they’ve signed up, they’ll get access to higher usage of advanced AI features in Workspace with the AI Expanded Access add-on when using their company Workspace account. Users will be able to manage their add-on subscription by visiting the user subscription management page at workspace.google.com/individual/user/manage.

Rollout pace

  • Rapid Release and Scheduled Release domains: Currently available
  • Users in eligible organizations may see the option to upgrade to the AI Expanded Access add-on when they reach certain AI feature usage limits starting no sooner than April 7, 2026.

Availability

  • Business Editions: Initially, users with Business Standard or Business Plus licenses in the United States and Canada will be able to make these purchases. We plan to make this capability available to more customers in the coming months. Administrators will be notified by email when user-initiated purchases will be made available in their domains, and they’ll have the opportunity to adjust settings for this capability in the Admin console before the rollout of user-initiated purchases.*
*Note: This feature is not yet available to customers who have purchased Workspace from a resale partner, an offline order form, or customers outside of the United States and Canada. For these customers, AI Expanded Access licenses can be purchased by Administrators in the Admin console, through your partner, or Google account manager.

Ads DevCast Episode 2: Audience Management

Many to 1, Audience Management in Data Manager API

In today's fast-paced digital landscape, managing multiple data pipelines can feel overwhelming. If you're a developer navigating the complexities of audience management and conversion tracking, you might be wondering: how can I streamline my processes and reduce costs?

This week’s episode of Ads DevCast explores the new Data Manager API, which will transform the way you connect data across various Google ad platforms, making your workflows more efficient and cost-effective. This product solves many of the complexities developers and agencies have had to juggle across multiple APIs—each with its own schema, maintenance requirements, and learning curve.

Watch the episode: https://youtu.be/Z4mAR98cizQ

About our Guest

In this episode we sit down with Melissa Ng, Product Manager for Data Manager API, to discuss how Google is streamlining these processes and what it means for your development workflow.

Melissa is a seasoned Product Manager, specializing in ads measurement at Google for the past four years. Her deep understanding of customer needs and data management challenges drives her passion for creating innovative solutions like the Data Manager API.

What are we solving for?

Every advertising product has its own unique requirements that aren't necessarily interoperable. This fragmentation creates confusion, slows down onboarding, and drives up engineering costs. It also makes it difficult to build the kind of cohesive "Data Strength" needed to fuel modern advertising and maximize conversions.

In a joint case study, one of our advertisers reported spending over $1 million annually on maintenance alone. We needed to provide a unified ingestion point for conversions and audiences, effectively reducing confusion and simplifying data management.

Enter the Data Manager API

The Data Manager API is Google's answer to this complexity. It allows data partners, agencies, and advertisers to send first-party data to multiple Google advertising products—such as Google Ads, Display & Video 360, Campaign Manager 360, Google Analytics, and others—in a single call.
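
As a rough mental model of "a single call" reaching several products, consider the Kotlin sketch below. It is purely illustrative: the types, field names, and fan-out function are hypothetical and are not the Data Manager API's actual request schema, which is defined in the official reference.

```kotlin
// Hypothetical types for illustration; this is not the Data Manager API schema.
enum class Product { GOOGLE_ADS, DISPLAY_VIDEO_360, CAMPAIGN_MANAGER_360, GOOGLE_ANALYTICS }

data class Destination(val product: Product, val accountId: String)

data class ConversionEvent(
    val transactionId: String,
    val valueMicros: Long,
    val currency: String,
)

// One request carries the events once, plus every destination they should reach.
data class IngestRequest(
    val destinations: List<Destination>,
    val events: List<ConversionEvent>,
)

// Fan-out view: each destination receives the same event objects, so a field
// such as transactionId is defined once and means the same thing everywhere.
fun deliveries(request: IngestRequest): List<Pair<Destination, ConversionEvent>> =
    request.destinations.flatMap { d -> request.events.map { e -> d to e } }
```

The point of the sketch is the shape: the caller maintains one event model and one request, instead of one schema per advertising product.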

"This means that when you’re sending in a field for, say, 'transaction id', that field means the same thing in all of the downstream use cases across all of the Google advertising products," explains Melissa in the episode.

While the primary motivation behind developing the Data Manager API was to address complexities associated with maintaining multiple APIs, there are other benefits as well:

  • Ease of Use and Efficiency at Scale: You can achieve significant savings in resources. Treasure Data, one of our early API integration partners, reported a 2x faster advertiser onboarding process and an 80% reduction in engineering efforts.
  • Privacy by Default: End-user privacy is non-negotiable. The Data Manager API features confidential matching right out of the box for audience solutions like Customer Match. This uses a Trusted Execution Environment (TEE) to ensure that no one, not even Google, can view the processed data.
  • Future-Proofing Your Stack: The Data Manager API is the future of audience and conversion management. Any new features for audience or conversion management will be implemented in the Data Manager API going forward.
  • Enhanced Diagnostics: For the first time, API stats and diagnostics will soon be available directly in the Ads UI, providing visibility into the health of your connections and downstream asynchronous errors, saving you countless hours of troubleshooting.

Tune In and Take Action

If you are managing audiences or conversions programmatically, this episode is a must-listen. It is packed with insights, roadmap updates, and advice on how to seamlessly migrate your existing systems.

Listen to the full episode of the Ads DevCast here!

Ready to get started with the Data Manager API?

  1. Start your upgrade today: Dive into the step-by-step tutorials and upgrade guides.
  2. Join the community: Connect with us and other developers on our Discord server.
  3. Share your feedback: We want to hear from you! Let us know what features you want to see next by filling out our Episode Survey. Reminder, this show is a pilot, and your feedback is important so we can tailor our content to what is most useful for you.

Thanks for tuning in! We'll see you in two weeks for Episode 3.

Android Studio supports Gemma 4: our most capable local model for agentic coding

Posted by Matthew Warner, Google Product Manager


Every developer's AI workflow and needs are unique, and it's important to be able to choose how AI helps your development. In January, we introduced the ability to choose any local or remote AI model to power AI functionality in Android Studio, and today, we're announcing the availability of Gemma 4 for AI coding assistance in Android Studio. This new local model trained on Android development provides the best of both worlds: the privacy and cost-efficiency of on-device processing alongside state-of-the-art reasoning and tool-calling capabilities.

AI assistance, locally delivered

By running locally on your machine, Gemma 4 gives you AI code assistance that doesn't require an internet connection or an API key for its core operations. Key benefits include:

  • Privacy and security: Your code stays on your machine. Gemma 4 processes all Agent Mode requests locally, making it an ideal choice for developers working with data privacy requirements or in secure corporate environments.
  • Cost efficiency: Run complex agentic workflows without worrying about hitting quotas. Gemma 4 is optimized to run efficiently on modern development hardware, utilizing local GPU and RAM to provide snappy, responsive assistance.
  • Offline availability: Use the agent to write code even when you don’t have an internet connection.
  • State-of-the-art reasoning: Gemma 4 delivers best-in-class reasoning, capable of complex multi-step coding tasks in Agent Mode.

Powerful agentic coding

Gemma 4 was trained for Android development with agentic tool calling capabilities. When you select Gemma 4 as your local model, you can leverage Agent Mode for a variety of development use cases, such as:

  • Designing new features: Developers can ask the agent to build a new feature or an entire app with commands like “build a calculator app” and the agent will not only generate the UI code but will use Android best practices like writing in Kotlin and using Jetpack Compose.
  • Refactoring: You can give high-level commands such as "Extract all hardcoded strings and migrate them to strings.xml." The agent will scan your codebase, identify instances requiring changes, and apply the edits across multiple files simultaneously.
  • Bug fixing and build resolution: If a project fails to build or has persistent lint errors, you can prompt the agent to "Build my project and fix any errors." The agent will navigate to the offending code and iteratively apply fixes until the build is successful.



Recommended hardware requirements

The 26B MoE model is recommended for Android app developers whose machine meets its minimum hardware requirements. The total RAM figure covers both Android Studio and Gemma.

Model            Total RAM needed   Storage needed
Gemma E2B        8 GB               2 GB
Gemma E4B        12 GB              4 GB
Gemma 26B MoE    24 GB              17 GB
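
If you want to script the choice of model size, the table above translates directly into a lookup. The Kotlin sketch below copies the figures from the table; the selection rule (pick the largest model that fits both RAM and free storage) is an assumption for illustration, not an official recommendation algorithm.

```kotlin
// RAM and storage figures are copied from the table above.
data class ModelSpec(val name: String, val totalRamGb: Int, val storageGb: Int)

val gemmaModels = listOf(
    ModelSpec("Gemma E2B", 8, 2),
    ModelSpec("Gemma E4B", 12, 4),
    ModelSpec("Gemma 26B MoE", 24, 17),
)

// Return the largest listed model that fits the machine, or null if none does.
fun largestModelThatFits(ramGb: Int, freeStorageGb: Int): ModelSpec? =
    gemmaModels
        .filter { it.totalRamGb <= ramGb && it.storageGb <= freeStorageGb }
        .maxByOrNull { it.totalRamGb }
```

On a machine with 16 GB of RAM and plenty of disk space, this returns the Gemma E4B entry; with less than 8 GB of RAM it returns null.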

Get started

To get started, ensure you have the latest version of Android Studio installed.
  1. Install an LLM provider, such as LM Studio or Ollama, on your local computer.
  2. In Settings > Tools > AI > Model Providers, add your LM Studio or Ollama instance.
  3. Download the Gemma 4 model from Ollama or LM Studio. Refer to the hardware requirements above when choosing a model size.
  4. In Agent Mode, select Gemma 4 as your active model.

For a detailed walkthrough on configuration, check out the official documentation on how to use a local model.

We are excited to see how Gemma 4 enables more private, secure, and powerful development workflows. As always, your feedback is essential as we continue to refine the AI experience in Android Studio. If you find a bug, please file an issue. You can also join our vibrant Android developer community on LinkedIn, YouTube, or X. Happy coding!

Announcing Gemma 4 in the AICore Developer Preview

Posted by David Chou, Product Manager and Caren Chang, Developer Relations Engineer



At Google, we’re committed to bringing the most capable AI models directly to the Android devices in your pocket. Today, we’re thrilled to announce the release of our latest state-of-the-art open model: Gemma 4.

These models are the foundation for the next generation of Gemini Nano, so code you write today for Gemma 4 will automatically work on Gemini Nano 4-enabled devices that will be available later this year. With Gemini Nano 4, you’ll benefit from our additional performance optimizations so you can ship to production across the Android ecosystem with the most efficient on-device inference.

You can get early access to this model today through the AICore Developer Preview.

Select the Gemini Nano 4 Fast model in the Developer Preview UI to see its blazing-fast inference speed in action before you write any code.

Because Gemma 4 natively supports over 140 languages, you can expect improved localized, multilingual experiences for your global audience. Furthermore, Gemma 4 offers industry-leading performance with multimodal understanding, allowing your apps to understand and process text, images, and audio. To give you the best balance of performance and efficiency, Gemma 4 on Android comes in two sizes:

  • E4B: Designed for higher reasoning power and complex tasks.
  • E2B: Optimized for maximum speed (3x faster than the E4B model!) and lower latency.

The new model is up to 4x faster than previous versions and uses up to 60% less battery. Starting today, you can experiment with improved capabilities including:

  • Reasoning: Chain-of-thought commands and conditional statements can now be expected to return higher quality results. For example: “Determine if the following comment for a discussion thread passes the community guidelines. The comment does not pass the community guidelines if it contains one or more of these reason_for_flag: profanity, derogatory language, hate speech. If the comment passes the community guidelines, return {true}. Otherwise, return {false, reason_for_flag}.”
  • Math: With better math skills, the model can now more accurately answer questions. For example: “If I get 26 paychecks per year, how much should I contribute each paycheck to reach my savings goal of $10,000 over the course of a year?”
  • Time understanding: The model is now more capable when reasoning about time, making it more accurate for use cases that involve calendars, reminders, and alarms. For example: “The event is at 6PM on August 18th, and a reminder should be sent out 10 hours before the event. Return the time and date the reminder should be sent.”
  • Image understanding: Use cases that involve OCR (Optical Character Recognition) - such as chart understanding, visual data extraction, and handwriting recognition - will now return more accurate results.
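
When you evaluate prompts like the math and time examples above, it helps to have a reference answer computed in ordinary code to compare the model's output against. The Kotlin sketch below does that for those two prompts. The helper names are made up for this example, rounding up to the cent is one possible convention, and the year in the usage note is an assumption because the prompt does not give one.

```kotlin
import java.math.BigDecimal
import java.math.RoundingMode
import java.time.LocalDateTime

// Per-paycheck contribution, rounded up to the cent so the total
// across all paychecks reaches the goal.
fun perPaycheck(goal: String, paychecks: Int): BigDecimal =
    BigDecimal(goal).divide(BigDecimal(paychecks), 2, RoundingMode.CEILING)

// Reminder time for an event, a fixed number of hours earlier.
fun reminderTime(event: LocalDateTime, hoursBefore: Long): LocalDateTime =
    event.minusHours(hoursBefore)
```

With these helpers, `perPaycheck("10000", 26)` gives 384.62, and `reminderTime(LocalDateTime.of(2026, 8, 18, 18, 0), 10)` gives 8:00 AM on August 18 of the same year.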

Join the Developer Preview today to download these preview models and start building next-generation features right away.

Start building with Gemma 4

Start testing the model

You can try out the model without code by following the Developer Preview guide. If you want to jump straight into integrating these models with your existing workflow, we’ve made that seamless. Head over to Android Studio to refine your prompt and build with the familiar ML Kit Prompt API. We’ve introduced a new ability to specify a model, allowing you to target the E2B (fast) or E4B (full) variants for testing.

// Define the configuration with a specific track and preference
val previewFullConfig = generationConfig {
    modelConfig = ModelConfig {
        releaseTrack = ModelReleaseTrack.PREVIEW
        preference = ModelPreference.FULL
    }
}

// Initialize the GenerativeModel with the configuration
val previewModel = GenerativeModel.getClient(previewFullConfig)

// Verify that the specific preview model is available
val previewModelStatus = previewModel.checkStatus()
if (previewModelStatus == FeatureStatus.AVAILABLE) {
    // Proceed with inference
    val response = previewModel.generateContent("If I get 26 paychecks per year, how much should I contribute each paycheck to reach my savings goal of $10k over the course of a year? Return only the amount.")
} else {
    // Handle the case where the preview model is not available
    // (e.g., print out log statements)
}

What to expect during the Developer Preview

The goal of this Developer Preview is to give you a head start on refining prompt accuracy and exploring new use cases for your specific apps. 

We will be making several updates throughout the preview period, including support for tool calling, structured output, system prompts, and thinking mode in Prompt API, making it easier to take full advantage of the new capabilities and significant performance optimizations in Gemma 4.

The preview models are available for testing on AICore-enabled devices. These models will run on the latest generation of specialized AI accelerators from Google, MediaTek, and Qualcomm Technologies. On other devices, the models will initially run on a CPU implementation that is not representative of final production performance. If your device is not AICore-enabled, you can also test these models via the AI Edge Gallery app. We’ll provide support for more devices in the future.

How to get started

Ready to see what Gemma 4 can do for your users?

  1. Opt-in: Sign up for the AICore Developer Preview.
  2. Download: Once opted in, you can trigger the download of the latest Gemma 4 models directly to your supported test device.
  3. Build: Update your ML Kit implementation to target the new models and start building in Android Studio.