TensorFlow Lite, now named LiteRT, is still the same high-performance runtime for on-device AI, but with an expanded vision to support models authored in PyTorch, JAX, and Keras.
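For instance, a Keras model converts to the `.tflite` format that LiteRT consumes using the long-standing TFLiteConverter API. A minimal sketch (the two-layer model is just a placeholder; PyTorch models would instead go through Google's ai-edge-torch package):

```python
import tensorflow as tf

# Placeholder model -- substitute your own trained Keras model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(2),
])

# Convert to a .tflite flatbuffer, the format the LiteRT runtime executes.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```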
Controlled Generation for Gemini 1.5 Pro and Flash ensures AI-generated responses adhere to a defined schema, improving the handoff from data science teams to developers and making model output far easier to integrate.
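Here is a minimal sketch using the google-generativeai Python SDK, which accepts a plain Python type as the response schema; the `Ticket` structure, prompt, and API key are illustrative:

```python
import typing
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

# Illustrative schema the response must conform to.
class Ticket(typing.TypedDict):
    summary: str
    severity: str

model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "Extract a support ticket from this email: ...",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema=list[Ticket],  # output is constrained to this shape
    ),
)
print(response.text)  # JSON matching the Ticket schema
```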
The RecurrentGemma architecture showcases a hybrid model that mixes gated linear recurrences with local sliding-window attention, a combination that is especially valuable when you're concerned about exhausting your LLM's context window.
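To make the fixed-state idea concrete, here is a toy NumPy sketch of a gated linear recurrence (illustrative only; the real RecurrentGemma/Griffin blocks are considerably more elaborate). The hidden state is a fixed-size vector updated once per token, so memory stays constant however long the sequence grows, unlike a transformer's KV cache:

```python
import numpy as np

def gated_linear_recurrence(x, w_gate, w_in):
    """Toy gated linear recurrence (not the real Griffin implementation):
    h_t = a_t * h_{t-1} + (1 - a_t) * (W_in @ x_t),
    with an input-dependent gate a_t = sigmoid(W_gate @ x_t).
    """
    h = np.zeros(w_in.shape[0])  # fixed-size state, independent of sequence length
    outputs = []
    for x_t in x:  # one update per time step
        a_t = 1.0 / (1.0 + np.exp(-(w_gate @ x_t)))  # sigmoid gate in [0, 1]
        h = a_t * h + (1.0 - a_t) * (w_in @ x_t)
        outputs.append(h)
    return np.stack(outputs)

# Example: 10 time steps of 4-dim inputs, 8-dim hidden state.
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 4))
out = gated_linear_recurrence(x, rng.normal(size=(8, 4)), rng.normal(size=(8, 4)))
print(out.shape)  # (10, 8)
```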
Gemma 2 is a new suite of open models that sets the standard for performance and accessibility, outperforming popular models more than twice its size.
Use the Gemma language model to gauge customer sentiment, summarize conversations, and help craft responses in near real time.
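A sentiment check might look like the sketch below, assuming the Hugging Face transformers library and access to an instruction-tuned Gemma 2 checkpoint; the model ID and prompt wording are illustrative:

```python
from transformers import pipeline

# Prompt an instruction-tuned Gemma model to label sentiment.
generator = pipeline("text-generation", model="google/gemma-2-2b-it")

review = "The agent resolved my billing issue in two minutes. Fantastic!"
prompt = (
    "Classify the sentiment of this customer message as positive, "
    f"negative, or neutral, and answer with one word.\n\n{review}"
)
result = generator(prompt, max_new_tokens=5, return_full_text=False)
print(result[0]["generated_text"])  # e.g. "positive"
```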
Learn more about the Gemma model variants, how each is designed for different use cases, and the core parameters of their architecture.
Create a text-based adventure game using Gemma 2. Here are code snippets and tips for designing the game world, enhancing interactivity and replayability, and more.
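A minimal game loop might look like this sketch; the model ID, narrator prompt, and plain-string history are illustrative simplifications (a real game would likely use the model's chat template and trim the history):

```python
from transformers import pipeline

# Skeleton for a Gemma 2-powered text adventure loop.
narrator = pipeline("text-generation", model="google/gemma-2-2b-it")

SETTING = (
    "You are the narrator of a text adventure set in a ruined "
    "observatory. Describe scenes in two or three sentences and "
    "end each turn by asking the player what they do next."
)

history = SETTING
while True:
    action = input("> ")
    if action.lower() in {"quit", "exit"}:
        break
    history += f"\nPlayer: {action}\nNarrator:"
    reply = narrator(history, max_new_tokens=120, return_full_text=False)
    text = reply[0]["generated_text"].strip()
    print(text)
    history += f" {text}"  # keep the narration in context for the next turn
```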
XNNPack, the default TensorFlow Lite CPU inference engine, has been updated to improve performance and memory management, allow cross-process collaboration, and simplify the user-facing API.
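Because XNNPack is the default CPU backend, a plain Interpreter session already benefits from it with no extra setup. A minimal sketch, assuming a `model.tflite` file on disk:

```python
import numpy as np
import tensorflow as tf

# Run a .tflite model with the Python Interpreter; CPU inference
# dispatches to XNNPack by default.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a random tensor matching the model's declared input shape.
shape = input_details[0]["shape"]
interpreter.set_tensor(input_details[0]["index"],
                       np.random.rand(*shape).astype(np.float32))
interpreter.invoke()
print(interpreter.get_tensor(output_details[0]["index"]))
```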
Gemini 1.5 Flash is now available to developers at prices reduced by more than 70%. Set up billing for the Gemini API in Google AI Studio and access other new features like 1.5 Flash tuning.