Author Archives: Google Developers Blog

Imagen 4 is now available in the Gemini API and Google AI Studio

Imagen 4, Google's advanced text-to-image model, is now available in paid preview via the Gemini API and Google AI Studio. It offers significant quality improvements, especially for rendering text within images. The Imagen 4 family includes Imagen 4 for general tasks and Imagen 4 Ultra for high-precision prompt adherence. All generated images carry a non-visible SynthID watermark.

Gemini 2.5 for robotics and embodied intelligence

Gemini 2.5 Pro and Flash bring improved coding, reasoning, and multimodal capabilities, including spatial understanding, to robotics. The models are used for semantic scene understanding, code generation for robot control, and building interactive applications with the Live API, with a strong emphasis on safety improvements and community applications.

Multilingual innovation in LLMs: How open models help unlock global communication

Developers are adapting open LLMs such as Gemma to diverse languages and cultural contexts. Their work on translating ancient texts, localizing mathematical understanding, and improving cultural sensitivity in lyric translation demonstrates AI's potential to bridge global communication gaps.

Gemini Code Assist in Apigee API Management now generally available

Gemini Code Assist in Apigee API Management adds AI-assisted features to API development, including natural language API creation, AI-generated summaries, and iterative design. It integrates with your organization's existing API ecosystem to ensure consistency, security, and reduced duplication, while offering enterprise-grade security and a streamlined development workflow.

Exploring the Magic Mirror: an interactive experience powered by the Gemini models

The Magic Mirror project uses the Gemini API, including the Live API, Function Calling, and Grounding with Google Search, to create an interactive, dynamic experience. It demonstrates how the Gemini models can generate visuals, tell stories, and provide real-time information through a familiar object.