Kubernetes 1.30 is now available in GKE in record time

Kubernetes 1.30 is now available in the Google Kubernetes Engine (GKE) Rapid Channel less than 20 days after the OSS release! For more information about the content of Kubernetes 1.30, read the Kubernetes 1.30 Release Notes and the specific GKE 1.30 Release Notes.


Control Plane Improvements

We're excited to announce that ValidatingAdmissionPolicy graduates to GA in 1.30. This feature enables many admission webhooks to be replaced with policies defined using the Common Expression Language (CEL) and evaluated directly in the kube-apiserver, which benefits both extension authors and cluster administrators by dramatically simplifying the development and operation of admission extensions. Many existing webhooks can be migrated to validating admission policies. For webhooks not ready or able to migrate, Match Conditions may be added to webhook configurations using CEL rules to pre-filter requests and reduce webhook invocations.
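To give a flavor of the API, here is a minimal sketch of a policy and its binding that cap Deployment replicas; the names and the limit are illustrative, following the pattern in the upstream Kubernetes documentation:

    # Policy: the CEL expression is evaluated directly in the kube-apiserver,
    # with no webhook backend to deploy or operate.
    apiVersion: admissionregistration.k8s.io/v1
    kind: ValidatingAdmissionPolicy
    metadata:
      name: demo-replica-limit
    spec:
      failurePolicy: Fail
      matchConstraints:
        resourceRules:
          - apiGroups: ["apps"]
            apiVersions: ["v1"]
            operations: ["CREATE", "UPDATE"]
            resources: ["deployments"]
      validations:
        - expression: "object.spec.replicas <= 5"
          message: "Deployments are limited to 5 replicas."
    ---
    # Binding: puts the policy into effect; this one applies it everywhere.
    apiVersion: admissionregistration.k8s.io/v1
    kind: ValidatingAdmissionPolicyBinding
    metadata:
      name: demo-replica-limit-binding
    spec:
      policyName: demo-replica-limit
      validationActions: ["Deny"]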

Validation Ratcheting makes CustomResourceDefinitions even safer and easier to manage. Prior to Kubernetes 1.30, when updating a custom resource, validation was required to pass for all fields, even fields not changed by the update. Now, with this feature, only fields changed in the custom resource by an update request must pass validation. This limits validation failures on update to the changed portion of the object, and reduces the risk of controllers getting stuck when a CustomResourceDefinition schema is changed, either accidentally or as part of an effort to increase the strictness of validation.
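To illustrate with a hypothetical CustomResourceDefinition: suppose a stricter "pattern" constraint was added to a field after objects were already created, so some stored values no longer validate. Ratcheting lets updates that leave the failing field untouched proceed:

    # Hypothetical CRD field whose validation was tightened after the fact:
    #   hostname:
    #     type: string
    #     pattern: "^[a-z0-9-]+$"   # new, stricter rule
    # An existing object with hostname "Legacy_Host" no longer matches.
    # With validation ratcheting, this update still succeeds, because the
    # failing field is unchanged:
    apiVersion: example.com/v1
    kind: Widget
    metadata:
      name: widget-a
    spec:
      hostname: Legacy_Host   # unchanged, so not re-validated
      replicas: 4             # changed, validated as usual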

Aggregated Discovery graduates to GA in 1.30, dramatically improving the performance of clients, particularly kubectl, when fetching the API information needed for many common operations. Aggregated discovery reduces the fetch to a single request and allows caches to be kept up-to-date by offering ETags that clients can use to efficiently poll the server for changes.


Data Plane Improvements

Dynamic Resource Allocation (DRA) is an alpha Kubernetes feature added in 1.26 that enables flexibility in configuring, selecting, and allocating specialized devices for pods. Feedback from SIG Scheduling and SIG Autoscaling revealed that the design needed revisions to reduce scheduling latency and fragility, and to support cluster autoscaling. In 1.30, the community introduced a new alpha design, DRA Structured Parameters, which takes the first step towards these goals. This is still an alpha feature with a lot of changes expected in upcoming releases. The newly formed WG Device Management has a charter to improve device support in Kubernetes - with a focus on GPUs and similar hardware - and DRA is a key component of that support. Expect further enhancements to the design in another alpha in 1.31. The working group has a goal of releasing some aspects to beta in 1.32.


Kubernetes continues its effort to eliminate perma-beta features: functionality that has long been used in production but was never marked as generally available. With this release, AppArmor support got some attention and moved closer to being marked GA.

There are also quality-of-life improvements in the Kubernetes data plane. Many of them will only be noticeable to system administrators rather than GKE users. In this release, however, the notable Sleep Action KEP entered beta and is available on GKE. It makes it easier to use slim images while still allowing graceful connection draining, specifically for some flavors of nginx images.
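As a sketch of how this looks in a pod spec (the image and durations are illustrative), the kubelet itself performs the pause before the container is terminated, so the image needs no shell or sleep binary:

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-drain-demo
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: nginx
          image: nginx:alpine-slim    # slim image, no shell required
          lifecycle:
            preStop:
              sleep:
                seconds: 5            # pause before SIGTERM so connections drain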

Acknowledgements

We want to thank all the Googlers who provide their time, passion, talent, and leadership to keep making Kubernetes the best container orchestration platform. For the features mentioned in this blog, we would especially like to thank Googlers Cici Huang, Joe Betz, Jiahui Feng, Alex Zielenski, Jeffrey Ying, John Belamaric, Tim Hockin, Aldo Culquicondor, Jordan Liggitt, Kuba Tużnik, Sergey Kanzhelev, and Tim Allclair.

Posted by Federico Bongiovanni – Google Kubernetes Engine

Empowering Women in Tech: A Personal Journey with Tech-Moms


Today on the GFiber Blog we’re featuring a guest post from Tech-Moms, one of our incredible Utah partner nonprofit organizations dedicated to helping women transition into tech and achieve long-term success in their careers through training and counseling. As a past cohort graduate, Katie Swenson, the organization’s Social Media Marketing Manager, reflects on her journey to Tech-Moms and why the recent partnership with GFiber is a game-changer. 


My journey into the tech world wasn’t typical. I didn’t major in computer science or dream of coding as a child. Instead, I took what some might call the “traditional” route in life. After graduating with a B.S. in Human Nutrition from Southern Utah University in 2011, I got married and embarked on a career in sales for a large food distributor. For seven years, I honed my skills in empathy, accountability, customer service, and negotiation, all while raising three young children with my husband. 

In 2020 the pandemic hit, and the restaurant industry, my primary clientele, suffered overnight. I found myself scrambling to help these businesses stay afloat while still juggling the demands of motherhood. The constant blur of balancing work and home life was taking its toll, and I knew I needed a change. 

In June 2020, I made the decision to leave my corporate job and become a stay-at-home mom. It was a leap of faith, knowing it would mean a significant pay cut for my family. Unfortunately, just two months later, I lost my mother to mesothelioma after a six-year battle. Amidst grief and uncertainty, I stumbled upon an article about Tech-Moms. 

Tech-Moms? Learning to code? It sounded intimidating at first, but something inside me urged me to sign up for their spring 2021 cohort. And it turned out to be one of the best decisions I ever made. Over the course of nine Saturday classes, I learned to design a website from scratch and found a supportive community of women that I never knew existed. 

This experience ignited a passion within me. I just knew I wanted to be a part of Tech-Moms in any way I could. When a part-time position for their social media marketing manager opened up, I jumped at the opportunity, despite not having much experience in the field and being pregnant with my fourth child. Two years later, I’m still learning and growing every day, surrounded by an incredible team while witnessing lives being transformed within this community, just like mine was. Being present with my children has been an immense blessing. There is no shortage of memories being made. 

With GFiber’s support, we are reaching even more women and able to provide them with the resources and opportunities they need to succeed in tech. Collectively, we’re breaking down barriers and empowering women to pursue their dreams, regardless of their background or previous experience. 

At Tech-Moms, our mission is simple yet profound: to empower women to thrive in the tech industry. Whether you’re a stay-at-home mom looking to reenter the workforce, or a seasoned professional seeking a change, we’re here to support you every step of the way. Learn more about Tech-Moms at tech-moms.org.

Posted by Katie Swenson, Social Media Marketing Manager

Tech-Moms is always accepting applications on our website. We run cohorts each fall and winter: September through November and January through March. There are two options for our core training program: in person each Saturday for 9 weeks (Utah only), and a virtual course taught in 2 sessions per week (generally one weeknight and Saturday morning for 9 weeks). Course tuition is $400 (financial assistance is available). Our students learn basic front-end web development, including HTML, CSS, and JavaScript, and build their own simple website as a final project. We also have extensive career exploration opportunities during the course. Upon completion, each student will receive additional support in choosing their next steps, including additional training and education, job search assistance, or both. Our graduates are moving confidently into tech roles across all industries, all with the long-term support of our Tech-Moms community. We truly live by our motto: Once a Tech-Mom, Always a Tech-Mom. 




Get notifications for all messages in a Google Chat space

What’s changing

In the last year, we’ve made numerous improvements to Google Chat that help you stay on top of the busy flow of communication and make it easier to prioritize and find the conversations that are most important to you. However, there are some conversations where you always need to be notified, like spaces dealing with customer support or operational issues. 

For conversations that require a higher level of attention, we’re introducing a new “notify all” functionality for in-line threaded spaces. If this option is selected, you will be notified of all new messages in the space. This includes receiving notifications for all @ mentions, threads followed, and even threads that you do not follow, allowing you to stay on top of everything happening in a conversation. 

The options within notification settings are being updated to: “All”, “Main conversations”, “For you”, and “None” so that you can better tailor your notification preferences for in-line threaded spaces. 




Getting started 

  • Admins: There is no admin control for this feature. 
  • End users: To update your notification settings in a space, click the three dots (more options) next to the space name > Notification settings > select an option for notifications. Or you can click the space header > Notifications > select an option for notifications. Visit the Help Center to learn more about customizing notifications for a space with in-line threading. 

Rollout pace 

Web: 
  • Rapid Release domains: Extended rollout (potentially longer than 15 days for feature visibility) starting on May 10, 2024 
  • Scheduled Release domains: Gradual rollout (up to 15 days for feature visibility) starting on June 4, 2024 
Android: 
iOS: 

Availability 

  • Available to all Google Workspace customers and Workspace Individual Subscribers 

Resources 

Set the default camera framing option for Google Meet hardware devices, and other framing updates

What’s changing

We’re introducing several updates around framing controls for Google Meet hardware devices:


First, we’re introducing an admin setting which will allow admins to choose a default framing option for their meeting spaces, ensuring every meeting begins with an optimally configured view. This will help your users jump right into their meetings without having to re-adjust camera settings from the previous meeting. This can be set individually for each device or via the bulk updates across your fleet.

Setting the default camera framing option in the Admin console




Next, we’re adding framing support on whiteboards (Series One Desk 27 and Board 65) and remote controlled only Google Meet hardware devices, which will help ensure optimal camera framing on these devices.


Remote control framing user interface
Whiteboard framing user interface




Finally, we’re making a few small adjustments to how camera framing settings appear on hardware devices. For Meet on Android, we’re removing the “Continuous framing” toggles and replacing them with a “Framing by” toggle. Depending on the third-party devices you’re using, you’ll see “Framing by Logitech”,“Framing by Huddly” or “Framing by Poly”, for example. We’re also changing the “Home” button to “Reset to default”.
Updated camera framing settings on Meet hardware devices



Getting started

  • Admins: You can configure default camera framing options for individual Google Meet hardware devices by going to Devices > Google Meet hardware > [Device Name] > Device Settings > Default camera framing. Or you can set the default camera framing option for multiple devices at once.
  • End users: Visit the Help Center to learn more about using device-based framing and using the Meet touchscreen to control audio and video.

Rollout pace

  • Whiteboard and remote control device support
    • Rapid and Scheduled Release domains: Gradual rollout (up to 15 days for feature visibility) starting on May 14, 2024

  • Admin control:
    • Rapid and Scheduled Release domains: Gradual rollout (up to 15 days for feature visibility) starting on May 21, 2024

Availability

  • Available to all Google Workspace customers

Resources


OpenXLA Dev Lab 2024: Building Groundbreaking ML Systems Together


AMD, Arm, AWS, Google, NVIDIA, Intel, Tesla, SambaNova, and more come together to crack the code for colossal AI workloads

As AI models grow increasingly complex and compute-intensive, the need for efficient, scalable, and hardware-agnostic infrastructure has never been greater. OpenXLA is a deep learning compiler framework that makes it easy to speed up and massively scale AI models on a wide range of hardware types—from GPUs and CPUs to specialized chips like Google TPUs and AWS Trainium. It is compatible with popular modeling frameworks—JAX, PyTorch, and TensorFlow—and delivers leading performance. OpenXLA is the acceleration infrastructure of choice for global-scale AI-powered products like Amazon.com Search, Google Gemini, Waymo self-driving vehicles, and xAI's Grok.


The OpenXLA Dev Lab

On April 25th, the OpenXLA Dev Lab played host to over 100 expert ML practitioners from 10 countries, representing industry leaders like AMD, Arm, AWS, ByteDance, Cerebras, Cruise, Google, NVIDIA, Intel, Tesla, SambaNova, and more. The full-day event, tailored to AI hardware vendors and infrastructure engineers, broke the mold of previous OpenXLA Summits by focusing purely on “Lab Sessions”, akin to office hours for developers, and hands-on Tutorials. The energy of the event was palpable as developers worked side-by-side, learning and collaborating on both practical challenges and exciting possibilities for AI infrastructure.

Figure 1: Developers from around the world congregated at the OpenXLA Dev Lab.

The Dev Lab was all about three key things:

  • Educate and Empower: Teach developers how to implement OpenXLA's essential workflows and advanced features through hands-on tutorials.
  • Offer Expert Guidance: Provide personalized office hours led by OpenXLA experts to help developers refine their ideas and contributions.
  • Foster Community: Encourage collaboration, knowledge-sharing, and lasting connections among the brilliant minds in the OpenXLA community.

Tutorials

The Tutorials included:

Integrating an AI Compiler & Runtime into PJRT

  • Learn how PJRT connects ML frameworks to AI accelerators, standardizing their interaction for easy model deployment on diverse hardware.
  • Explore the PJRT C API for framework-hardware communication.
  • Implement a PJRT Plugin, a Python package that implements the C API.
  • Discover plugin examples for Apple Metal, CUDA, Intel GPU, and TPU.

Led by Jieying Luo and Skye Wanderman-Milne
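Outside the tutorial itself, the end result for users is pleasantly simple. As a minimal sketch (plugin names and device listings are illustrative), once a PJRT plugin package is installed, JAX discovers and loads it automatically:

    # With a PJRT plugin package installed (e.g. one of the Apple Metal or
    # Intel GPU plugins mentioned above), its devices show up in JAX with no
    # framework changes. Exact device names depend on the plugin.
    import jax

    print(jax.devices())            # devices exposed by the loaded PJRT backend(s)
    print(jax.default_backend())    # e.g. "cpu", "gpu", or a plugin platform name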


Extracting StableHLO Graphs + Intro to StableHLO Quantizer

  • Learn to export StableHLO from JAX, PyTorch, and TensorFlow using static/dynamic shapes and SavedModel format.
  • Hack along with the tutorial using the JAX, PyTorch, and TensorFlow Colab notebooks provided on OpenXLA.org.
  • Simplify quantization with StableHLO Quantizer, a framework- and device-agnostic tool.
  • Explore streamlined parameter selection and model rewriting for lower precision.

Led by Kevin Gleason, Jen Ha, and Xing Liu
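For a taste of the JAX path (a minimal sketch; the Colab notebooks above are the authoritative walkthroughs):

    # Extract StableHLO from a JAX function by lowering it for concrete input
    # shapes; the OpenXLA.org notebooks cover PyTorch and TensorFlow as well.
    import jax
    import jax.numpy as jnp

    def model(x, w):
        return jnp.tanh(x @ w)

    x = jnp.ones((8, 16), jnp.float32)
    w = jnp.ones((16, 4), jnp.float32)

    # Lower the jitted function and print its StableHLO representation.
    print(jax.jit(model).lower(x, w).compiler_ir(dialect="stablehlo"))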


Optimizing PyTorch/XLA Auto-sharding for Your Hardware

  • Discover this experimental feature that automates distributing large-scale PyTorch models across XLA devices.
  • Learn how it partitions and distributes for out-of-the-box performance without manual intervention.
  • Explore future directions such as customizable cost models for different hardware.

Led by Yeounoh Chung and Pratik Fegade


Optimizing Compute and Communication Scheduling with XLA

  • Scale ML models across multiple GPUs with SPMD partitioning, collective communication, and HLO optimizations.
  • Explore tensor parallelism, the latency-hiding scheduler, and pipeline parallelism.
  • Learn collective optimizations and pipeline parallelism for efficient large-scale training.

Led by Frederik Gossen, TJ Xu, and Abhinav Goel
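To make the SPMD model concrete, here is a small sketch using JAX's public sharding APIs (the mesh axis name is illustrative; on CPU you can simulate devices with XLA_FLAGS=--xla_force_host_platform_device_count=4):

    # Shard a rank-2 tensor across all available devices and let XLA's SPMD
    # partitioner insert whatever collective communication the program needs.
    import jax
    import jax.numpy as jnp
    from jax.experimental import mesh_utils
    from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

    mesh = Mesh(mesh_utils.create_device_mesh((jax.device_count(),)), ("data",))
    x = jax.device_put(jnp.ones((1024, 512)),
                       NamedSharding(mesh, P("data", None)))  # rows sharded

    @jax.jit
    def step(x):
        y = jnp.tanh(x)
        return y @ y.T   # cross-device contraction; collectives are inserted

    print(step(x).sharding)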


Lab Sessions

Lab Sessions featured use case-specific office hours for AMD, Arm, AWS, ByteDance, Intel, NVIDIA, SambaNova, Tesla, and more. OpenXLA engineers were on hand to provide development teams with dedicated support and walk through specific pain points and designs. In addition, Informational Roundtables that covered broader topics like GPU ML Performance Optimization, JAX, and PyTorch-XLA GPU were available for those without specific use cases. This approach led to productive exchanges and fine-grained exploration of critical contribution areas for ML hardware vendors.


Don’t just take our word for it – here’s some of the feedback we received from developers:

"PJRT is awesome, we're looking forward to building with it. We are very grateful for the support we are getting." 
      — Mark Gottscho, Senior Manager and Technical Lead at SambaNova
"Today I learned a lot about Shardonnay and about some of the bugs I found in the GSPMD partitioner, and I got to learn a lot of cool stuff." 
      — Patrick Toulme, Machine Learning Engineer at AWS
“I learned a lot, a lot about how XLA is making tremendous progress in building their community.” 
      — Tejash Shah, Product Manager at NVIDIA
“Loved the format this year - please continue … lots of learning, lots of interactive sessions. It was great!” 
      — Om Thakkar, AI Software Engineer at Intel

Technical Innovations and The Bold Road Ahead

The event kicked off with a keynote by Robert Hundt, Distinguished Engineer at Google, who outlined OpenXLA's ambitious plans for 2024, particularly three major areas of focus:

  • Large-scale training
  • GPU and PyTorch compute performance
  • Modularity and extensibility

Empowering Large-Scale Training

OpenXLA is introducing powerful features to enable model training at record-breaking scales. One of the most notable additions is Shardonnay, a tool coming soon to OpenXLA that automates and optimizes how large AI workloads are divided across multiple processing units, ensuring efficient use of resources and faster time to solution. Building on the success of its predecessor, SPMD, Shardonnay empowers developers with even more fine-grained control over partitioning decisions, all while maintaining the productivity benefits that SPMD is known for.

Figure 2: Sharding representation example with a simple rank 2 tensor and 4 devices.

In addition to Shardonnay, developers can expect a suite of features designed to optimize computation and communication overlap, including:

  • Automatic profile-guided latency estimation
  • Collective pipelining
  • Heuristics-based collective combiners

These innovations will enable developers to push the boundaries of large-scale training and achieve unprecedented performance and efficiency.


OpenXLA Delivers on TorchBench Performance

OpenXLA has also made significant strides in enhancing performance, particularly on GPUs with key PyTorch-based generative AI models. PyTorch-XLA GPU is now neck and neck with TorchInductor for TorchBench Full Graph Models and has a TorchBench pass rate within 5% of TorchInductor.

Figure 3: Performance comparison of TorchInductor vs. PyTorch-XLA GPU on Google Cloud NVIDIA H100 GPUs. “Full graph models” represent all TorchBench models that can be fully represented by StableHLO.

Behind these impressive gains lies XLA GPU's global cost model, a game-changer for developers. In essence, this cost model acts as a sophisticated decision-making system, intelligently determining how to best optimize computations for specific hardware. The cost model delivers state-of-the-art performance through a priority-based queue for fusion decisions and is highly extensible, allowing third-party developers to seamlessly integrate their backend infrastructure for both general-purpose and specialized accelerators. The cost model's adaptability ensures that computation optimizations are tailored to specific accelerator architectures, while less suitable computations can be offloaded to the host or other accelerators.

OpenXLA is also breaking new ground with novel kernel programming languages, Pallas and Mosaic, which empower developers to write highly optimized code for specialized hardware. Mosaic demonstrates remarkable efficiency in programming key AI accelerators, surpassing widely used libraries in GPU code generation efficiency for models with 64, 128, and 256 Q head sizes, as evidenced by its enhanced utilization of TensorCores.

Figure 4: Performance comparison of Flash Attention vs. Mosaic GPU on NVIDIA H100 GPUs.
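For orientation, here is what a trivially small Pallas kernel looks like (a sketch; production kernels like the attention implementations benchmarked above add explicit blocking and memory-space control on top of the same primitives):

    # A fused elementwise Pallas kernel. On TPU, Pallas compiles through
    # Mosaic; interpret=True runs it in a slow debug interpreter so the
    # sketch also works on CPU, and can be dropped on real accelerators.
    import jax
    import jax.numpy as jnp
    from jax.experimental import pallas as pl

    def scaled_add_kernel(x_ref, y_ref, o_ref):
        o_ref[...] = x_ref[...] * 2.0 + y_ref[...]

    @jax.jit
    def scaled_add(x, y):
        return pl.pallas_call(
            scaled_add_kernel,
            out_shape=jax.ShapeDtypeStruct(x.shape, x.dtype),
            interpret=True,
        )(x, y)

    x = jnp.ones((128, 128), jnp.float32)
    print(scaled_add(x, x)[0, 0])   # 3.0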

Modular and Extensible AI Development

In addition to performance enhancements, OpenXLA is committed to making the entire stack more modular and extensible. Several initiatives planned for 2024 include:

  • Strengthening module interface contracts
  • Enhancing code sharing between platforms
  • Enabling a shared high-level compiler flow through runtime configuration and component registries

Figure 5: Modules and subcomponents of the OpenXLA stack.

These improvements will make it easier for developers to build upon and extend OpenXLA.

Alibaba's success with PyTorch XLA FSDP within their TorchAcc framework is a prime example of the benefits of OpenXLA's modularity and extensibility. By leveraging these features, Alibaba achieved state-of-the-art performance for the LLaMa 2 13B model, surpassing the previous benchmark set by Megatron. This demonstrates the power of the developer community in extending OpenXLA to push the boundaries of AI development.

Figure 6: Performance comparison of TorchAcc and Megatron for LLaMa 2 13B at different numbers of GPUs.

Join the OpenXLA Community

If you missed the Dev Lab, don't worry! You can still access StableHLO walkthroughs on openxla.org, as well as the GitHub Gist for the PJRT session. Additionally, the recorded keynote and tutorials are available on our YouTube channel. Explore these resources and join our global community – whether you're an AI systems expert, model developer, student, or just starting out, there's a place for you in our innovative ecosystem.


Acknowledgements

Allen Hutchison, Amin Vahdat, Andrew Leaver, Andy Davis, Artem Belevich, Abhinav Goel, Benjamin Kramer, Berkin Illbeyi, Bill Jia, Eugene Zhulenev, Florian Reichl, Frederik Gossen, George Karpenkov, Gunhyun Park, Han Qi, Jack Cao, Jaesung Chung, Jen Ha, Jianting Cao, Jieying Luo, Jiewen Tan, Jini Khetan, Kevin Gleason, Kyle Lucke, Kuy Mainwaring, Lauren Clemens, Manfei Bai, Marisa Miranda, Michael Levesque-Dion, Milad Mohammadi, Nisha Miriam Johnson, Penporn Koanantakool, Robert Hundt, Sandeep Dasgupta, Sayce Falk, Shauheen Zahirazami, Skye Wanderman-Milne, Yeounoh Chung, Pratik Fegade, Peter Hawkins, Vaibhav Singh, Tamás Danyluk, Thomas Joerg, Adam Paszke, and TJ Xu.

By James Rubin, Aditi Joshi, and Elliot English on behalf of the OpenXLA Project

Breakout room information is now included in Google Meet attendance reports

What’s changing 

We’re now including breakout room attendance as part of attendance reporting in Google Meet. Attendance reports help meeting organizers keep track of who attended their meetings and for how long, which can be challenging during larger meetings or while presenting. This becomes more complicated when using breakout rooms to divide meeting participants into smaller groups. Adding breakout room attendance makes for a more comprehensive report and reduces the burden on meeting hosts to track breakout room attendance manually.


Getting started

  • Admins: Visit the Help Center to learn more about letting organizers get reports on meeting attendance.
  • End users: When enabled by your admin, attendance reports will automatically be sent to the meeting host. Attendance reports for breakout rooms will be in their own tab in the spreadsheet. Visit the Help Center to learn more about attendance tracking.

Rollout pace


Availability

Available to Google Workspace:
  • Essentials
  • Business Plus
  • Enterprise Starter, Essentials, Standard, and Plus
  • Education Plus and the Teaching and Learning Upgrade

Resources


New ways to quickly format and organize data with Tables in Google Sheets

This announcement was part of Google Cloud Next ‘24. Visit the Workspace Blog to learn more about the next wave of innovations in Workspace, including enhancements to Gemini for Google Workspace.


What’s changing

We know it can be time-consuming to perform repetitive tasks like updating data in a spreadsheet. In addition, maintaining the structure and format of the data can be difficult when multiple people are updating the document.

To help solve for this, we're excited to announce tables in Google Sheets. With tables, you can simplify and accelerate spreadsheet building by bringing format and structure to unorganized ranges. Select your data range and go to Format > Convert to table, and Sheets does the heavy lifting to format and organize your data with a polished design, including column types, filters, color coding, dropdown menus, and more. 
Convert to table in Google Sheets
Here’s how using tables reduces the time you would usually spend manually formatting data: 
  • Auto-applied formatting: When you convert your data to a table, Sheets automatically applies formatting to polish your data so that all inputs are properly aligned, reducing the need for manual changes. You can further customize your table by changing colors, readjusting the row height, and more. 
  • Column types: For each column, you can set the appropriate column type (e.g., date, currency, dropdown), and your table will make sure all entered data has the right formatting based on the column type. Data entered that does not align with a set column type will result in a warning. 
  • Unified menu: Above the table, you will see a menu option to manage table-level settings (e.g., adjust table range) and take action (e.g., create a filter view for your table). 
  • Table references: Table references are a special way to refer to a table or parts of a table in a formula. When you convert your data to a table, Sheets provides a name for the table and each column header. When you reference table elements by name, the references update whenever you add or remove data from the table. For example: instead of explicit cell references, =COUNTIF(B2:B10, "P0"), you can use table references: =COUNTIF(Task_tracker[Priority], "P0"). 
When you are using tables, you’ll also have access to our new type of view, group by, where you can aggregate your data into groups based on a selected column. For instance, you can decide to group all data at the same priority level in one place, as shown below.
Group by view in Tables in Google Sheets
We’re also introducing pre-built tables that you can populate with common data types for everyday tasks like project management, inventory management, event planning and more. Now with pre-built tables, you never have to build a spreadsheet from scratch again. 
Pre-built tables in Sheets

Who’s impacted

End users 


Why it matters 

Tables will transform the way teams organize their data, simplify data creation, and reduce the repetitive tasks needed to format, input, and update data. They also allow teams to confidently share data widely while maintaining its integrity and consistency. 

Tables are well suited for tracking and organizing information such as project tracking, event planning, and inventory management. 

Getting started 

Rollout pace 

  • Rapid Release domains: Extended rollout (potentially longer than 15 days for feature visibility) starting on May 8, 2024, with expected completion by May 30, 2024 
  • Scheduled Release domains: Gradual rollout (up to 15 days for feature visibility) starting on June 6, 2024 

Availability 

  • Available to all Google Workspace customers, Google Workspace Individual subscribers, and users with personal Google accounts 

Resources