Author Archives: Ben Treynor Sloss

Expanding our global infrastructure with new regions and subsea cables

At Google, we’ve spent $30 billion improving our infrastructure over the past three years, and we’re not done yet. From data centers to subsea cables, Google is committed to connecting the world and serving our Cloud customers, and today we’re excited to announce that we’re adding three new submarine cables and five new regions.

We’ll open our Netherlands and Montreal regions in the first quarter of 2018, followed by Los Angeles, Finland, and Hong Kong – with more to come. Then, in 2019 we’ll commission three subsea cables: Curie, a private cable connecting Chile to Los Angeles; Havfrue, a consortium cable connecting the U.S. to Denmark and Ireland; and the Hong Kong-Guam Cable System (HK-G), a consortium cable interconnecting major subsea communication hubs in Asia.

Together, these investments further improve our network—the world’s largest—which by some accounts delivers 25% of worldwide internet traffic. Companies like PayPal leverage our network and infrastructure to run their businesses effectively.

“At PayPal, we process billions of transactions across the globe, and need to do so securely, instantaneously and economically. As a result, security, networking and infrastructure were key considerations for us when choosing a cloud provider,” said Sri Shivananda, PayPal’s Senior Vice President and Chief Technology Officer. “With Google Cloud, we have access to the world’s largest network, which helps us reach our infrastructure goals and best serve our millions of users.”

Figure 1. Diagram shows existing GCP regions and upcoming GCP regions.

Figure 2. Diagram shows three new subsea cable investments, expanding capacity to Chile, Asia Pacific and across the Atlantic.

Curie cable

Our investment in the Curie cable (named after renowned scientist Marie Curie) is part of our ongoing commitment to improve global infrastructure. In 2008, we were the first tech company to invest in a subsea cable as part of a consortium. With Curie, we become the first major non-telecom company to build a private intercontinental cable.

By deploying our own private subsea cable, we help improve global connectivity while providing value to our customers. Owning the cable ourselves has some distinct benefits. Since we control the design and construction process, we can fully define the cable’s technical specifications, streamline deployment and deliver service to users and customers faster. Also, once the cable is deployed, we can make routing decisions that optimize for latency and availability.

Curie will be the first subsea cable to land in Chile in almost 20 years. Once deployed, Curie will be Chile’s largest single data pipe. It will serve Google users and customers across Latin America.

Havfrue cable

To increase capacity and resiliency in our North Atlantic systems, we’re working with Facebook, Aqua Comms and Bulk Infrastructure to build a direct submarine cable system connecting the U.S. to Denmark and Ireland. This cable, called Havfrue (Danish for “mermaid”), will be built by TE SubCom and is expected to come online by the end of 2019. The marine route survey, during which the supplier determines the specific route the cable will take, is already underway.

HK-G cable

In the Pacific, we’re working with RTI-C and NEC on the Hong Kong-Guam cable system. Together with Indigo and other existing subsea systems, this cable creates multiple scalable, diverse paths to Australia, increasing our resilience in the Pacific. As a result, customers will see increased capacity and lower latency between Australia and major hubs in Asia. It will also increase our network capacity at our new Hong Kong region.
Figure 3. A complete list of Google’s subsea cable investments. New cables in this announcement are highlighted yellow. Google subsea cables provide reliability, speed and security not available from any other cloud.

Google has direct investment in 11 cables, including those planned or under construction. The three cables highlighted in yellow are being announced in this blog post. (In addition to these 11 cables where Google has direct ownership, we also lease capacity on numerous additional submarine cables.)

What does this mean for our customers?

These new investments expand our existing cloud network. The Google network has over 100 points of presence and over 7,500 edge caching nodes. This investment means faster and more reliable connectivity for all our users.

Simply put, it wouldn’t be possible to deliver products like Machine Learning Engine, Spanner, BigQuery and other Google Cloud Platform and G Suite services at the quality of service users expect without the Google network. Our cable systems provide the speed, capacity and reliability Google is known for worldwide, and at Google Cloud, our customers are able to make use of the same network infrastructure that powers Google’s own services.

While we haven’t hastened the speed of light, we have built a superior cloud network as a result of the well-provisioned direct paths between our cloud and end-users, as shown in the figure below.

Figure 4. The Google network offers better reliability, speed and security performance as compared with the nondeterministic performance of the public internet or other cloud networks. The Google network consists of fiber optic links and subsea cables between 100+ points of presence, 7,500+ edge node locations, 90+ Cloud CDN locations, 47 dedicated interconnect locations and 15 GCP regions.

We’re excited about these improvements. We're increasing our commitment to ensure users have the best connections in this increasingly connected world.

Source: Google Cloud

Protecting our Google Cloud customers from new vulnerabilities without impacting performance

If you’ve been keeping up on the latest tech news, you’ve undoubtedly heard about the CPU security flaw that Google’s Project Zero disclosed last Wednesday. On Friday, we answered some of your questions and detailed how we are protecting Cloud customers. Today, we’d like to go into even more detail on how we’ve protected Google Cloud products against these speculative execution vulnerabilities, and what we did to make sure our Google Cloud customers saw minimal performance impact from these mitigations.

Modern CPUs and operating systems protect programs and users by putting a “wall” around them so that one application, or user, can’t read what’s stored in another application’s memory. These boundaries are enforced by the CPU.

But as we disclosed last week, Project Zero discovered techniques that can circumvent these protections in some cases, allowing one application to read the private memory of another, potentially exposing sensitive information.

The vulnerabilities come in three variants, each of which must be protected against individually. Variant 1 and Variant 2 have also been referred to as “Spectre.” Variant 3 has been referred to as “Meltdown.” Project Zero described these in technical detail, the Google Security blog described how we’re protecting users across all Google products, and we explained how we’re protecting Google Cloud customers and provided guidance on security best practices for customers who use their own operating systems with Google Cloud services.
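
To make the attack class concrete, here is a minimal C sketch of the kind of code pattern Variant 1 (bounds check bypass) exploits; the array names and sizes are illustrative, not drawn from any real codebase. The bounds check is architecturally correct, but a branch predictor trained to expect the in-range case can speculatively execute the out-of-bounds read, leaving a cache footprint that reveals the secret byte.

    #include <stdint.h>
    #include <stddef.h>

    uint8_t array1[16];          /* attacker-influenced array (illustrative) */
    size_t  array1_size = 16;
    uint8_t array2[256 * 512];   /* probe array used as a cache side channel */

    /* Variant 1 gadget: with x out of range and the branch predictor trained
     * to predict "taken", the CPU may speculatively read array1[x] (a secret
     * byte beyond the array) and use it to index array2, pulling one line of
     * array2 into the cache. The attacker later times accesses to array2 to
     * discover which line is cached, recovering the secret byte. */
    uint8_t victim_function(size_t x) {
        if (x < array1_size) {
            return array2[array1[x] * 512];
        }
        return 0;
    }

Variant 3 (Meltdown) relies on the same cache side channel, but exploits speculative reads across the user/kernel privilege boundary rather than a mispredicted bounds check.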

Surprisingly, these vulnerabilities have been present in most computers for nearly 20 years. Because the vulnerabilities exploit features that are foundational to most modern CPUs—and were previously believed to be secure—they weren’t just hard to find; they were even harder to fix. For months, hundreds of engineers across Google and other companies worked continuously to understand these new vulnerabilities and find mitigations for them.

In September, we began deploying solutions for both Variants 1 and 3 to the production infrastructure that underpins all Google products—from Cloud services to Gmail, Search and Drive—with more refined solutions following in October. Thanks to extensive performance tuning work, these protections caused no perceptible impact in our cloud and required no customer downtime, due in part to Google Cloud Platform’s Live Migration technology. No GCP customer or internal team has reported any performance degradation.

While those solutions addressed Variants 1 and 3, it was clear from the outset that Variant 2 was going to be much harder to mitigate. For several months, it appeared that disabling the vulnerable CPU features would be the only option for protecting all our workloads against Variant 2. While that was certain to work, it would also disable key performance-boosting CPU features, thus slowing down applications considerably.

Not only did we see considerable slowdowns for many applications, we also noticed inconsistent performance, since the speed of one application could be impacted by the behavior of other applications running on the same core. Rolling out these mitigations would have negatively impacted many customers.

With the performance characteristics uncertain, we started looking for a “moonshot”—a way to mitigate Variant 2 without hardware support. Finally, inspiration struck in the form of “Retpoline”—a novel software binary modification technique that prevents branch target injection, created by Paul Turner, a software engineer who is part of our Technical Infrastructure group. With Retpoline, we didn’t need to disable speculative execution or other hardware features. Instead, this solution modifies programs to ensure that execution cannot be influenced by an attacker.

With Retpoline, we could protect our infrastructure at compile-time, with no source-code modifications. Furthermore, testing this feature, particularly when combined with optimizations such as software branch prediction hints, demonstrated that this protection came with almost no performance loss.
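
As a sketch of how the technique works, here is the published retpoline construction for an indirect jump through the %r11 register, expressed as GNU inline assembly in C; the thunk name follows the convention compilers use, but treat this as an illustration rather than our exact production code. Instead of letting the CPU predict an indirect jump freely, the code uses a call/return pair whose return address it controls, trapping any speculation in a harmless loop.

    /* Retpoline thunk replacing "jmp *%r11" (x86-64, GNU assembly).
     * The return-stack predictor speculates to the instruction after the
     * call (the pause/lfence loop), while the architectural path
     * overwrites the return address and "returns" to the real target. */
    __asm__(
        ".global __x86_indirect_thunk_r11\n"
        "__x86_indirect_thunk_r11:\n"
        "    call  1f\n"            /* push address of the capture loop */
        "2:  pause\n"               /* speculation lands here and spins */
        "    lfence\n"              /* ...harmlessly, leaking nothing   */
        "    jmp   2b\n"
        "1:  mov   %r11, (%rsp)\n"  /* overwrite return address         */
        "    ret\n"                 /* architectural jump to *%r11      */
    );

In practice the compiler applies this transformation everywhere: Clang’s -mretpoline and GCC’s -mindirect-branch=thunk rewrite every indirect call and jump to go through thunks like this one, which is what makes it possible to protect binaries at compile time with no source changes.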

We immediately began deploying this solution across our infrastructure. In addition to sharing the technique with industry partners upon its creation, we open-sourced our compiler implementation in the interest of protecting all users.

By December, all Google Cloud Platform (GCP) services had protections in place for all known variants of the vulnerability. During the entire update process, nobody noticed: we received no customer support tickets related to the updates. This confirmed our internal assessment that in real-world use, the performance-optimized updates Google deployed do not have a material effect on workloads.

We believe that Retpoline-based protection is the best-performing solution for Variant 2 on current hardware. Retpoline fully protects against Variant 2 without impacting customer performance on all of our platforms. In sharing our research publicly, we hope that this can be universally deployed to improve the cloud experience industry-wide.

This set of vulnerabilities was perhaps the most challenging and hardest to fix in a decade, requiring changes to many layers of the software stack. It also required broad industry collaboration, since the scope of the vulnerabilities was so widespread. Because of the breadth of the impact and the complexity involved in developing fixes, the response to this issue has been one of the few times that Project Zero made an exception to its 90-day disclosure policy.

While these vulnerabilities represent a new class of attack, they're just a few among the many different types of threats our infrastructure is designed to defend against every day. Our infrastructure includes mitigations by design and defense-in-depth, and we’re committed to ongoing research and contributions to the security community and to protecting our customers as new vulnerabilities are discovered.

Source: Google Cloud

What Google Cloud, G Suite and Chrome customers need to know about the industry-wide CPU vulnerability

Last year, Google’s Project Zero security team discovered a vulnerability affecting modern microprocessors. Since then, Google engineering teams have been working to protect our customers from the vulnerability across the entire suite of Google products, including Google Cloud Platform (GCP), G Suite applications, and the Google Chrome and Chrome OS products. We also collaborated with hardware and software manufacturers across the industry to help protect their users and the broader web.

All G Suite applications have already been updated to prevent all known attack vectors. G Suite customers and users do not need to take any action to be protected from the vulnerability.

GCP has already been updated to protect against all known attack vectors. Google Cloud is architected in a manner that enables us to update the environment while providing operational continuity for our customers. We used our VM Live Migration technology to perform the updates with no user impact, no forced maintenance windows and no required restarts.

Customers who use their own operating systems with GCP services may need to apply additional updates to their images; please refer to the GCP section of the Google Security blog post concerning this vulnerability for additional details. As more updates become available, they will be tracked on the Compute Engine Security Bulletins page.

Finally, customers using Chrome browser—including for G Suite or GCP—can take advantage of Site Isolation as an additional hardening feature across desktop platforms, including Chrome OS. Customers can turn on Site Isolation for a specific set of websites or for all websites (for example, via the chrome://flags/#enable-site-per-process setting or the corresponding enterprise policies).

The Google Security blog includes more detailed information about this vulnerability and mitigations across all Google products.  

Source: Google Cloud