Is multi-cloud a pipe dream? I think not!
From startups to enterprises, despite material increases in efficiency and the price to performance ratio of the compute, network and storage resources we all use, infrastructure continues to come at substantial cost. It can also be a real risk driver; each implementation choice affects future scalability, service level and flexibility of the services being built. It’s fair to say that “future-proofing” should be the primary concern of every system architect.
Providers of infrastructure aren’t disinterested actors either; there are huge incentives for any vendor to increase lock-in through contractual, fiscal and technical constrictions. In many cases, interest in cloud infrastructure, particularly existing consumers of infrastructure, has been driven by a huge urge to break free of existing enterprise vendor relationships for which the lock-in costs are higher than the value provided. Once they have some kind of lock-in working, infrastructure companies know that they can charge higher rents without necessarily earning them.
A good first step is to actively resist lock-in mechanisms. Most consumers have figured out that long-term contractual commitments can be dangerous. Most have figured out that pre-paid arrangements distort decisionmaking and can be dangerous. Technical lock-in remains one of the most difficult to avoid. Many providers wrap valuable differentiated services in proprietary APIs so that applications eventually get molded around their design. These “sticky services” or “loss leaders” create substantial incentives for tech shops to take the shorter path to value and accept a bit of lock-in risk. This is a prevalent form of technical debt, especially when new vendors release even more powerful and differentiated tools in the same space, or when superior solutions rise out of OSS communities.
In the past, some companies tried to help users get out from under this debt by building abstraction layers on top of the proprietary APIs from each provider, so that users could use one tool to broker across multiple clouds. This approach has been messy and fragile, and tends to compromise to the lowest-common denominator across clouds. It also invites strategic disruption from cloud providers in order to preserve customer lock-in.
Open architecturesThankfully, this isn’t the only way technology works. It’s entirely possible to efficiently build scaled, high performance, cost-efficient systems without accepting unnecessary technical lock-in risk or tolerating the lowest-common denominator. You can even still consume proprietary infrastructure products, as long as you can prove to yourself that because those products expose open APIs, you can move when you want to. This is not to say that this isn’t complex, advanced work. It is. But the amount of time and effort required is shrinking radically every day. This gives users leverage; as your freedom goes up, it becomes easier and easier to treat providers like the commodities they ought to be.
We understand the value of proprietary engineering. We’ve created a purpose built cloud stack, highly tuned for scale, performance, security, and flexibility. We extract real value from this investment, through our advertising, applications as well as our cloud businesses. But GCP, along with some other providers and members of the broader technology community, recognize that when users have power, they can do powerful things. We’ve worked hard to deliver services that are differentiated by their performance, stability and cost, but not by proprietary, closed APIs. We know this means that you can stop using us when you want to; we think that gives you the power to use us at lower risk. Some awesome folks have started calling this approach GIFEE or “Google Infrastructure For Everyone Else. But given the overwhelming participation and source code contributions — including those for kubernetes — from individuals and companies of all sizes to the OSS projects involved, it’s probably more accurate to call it Everyone’s Infrastructure, For Every Cloud — unfortunately that’s a terrible acronym.
|Links: AppScale, Kubernetes, Apache HBase, Druid, Apache Drill, Vitess, Apache Beam, TensorFlow, Minio.io, Spinnaker|
A few salient examples:
Applications can run in containers on Kubernetes, the OSS container orchestrator that Google helped create, either managed and hosted by us via GKE, or on any provider, or both at the same time.
Kubernetes ensures that your containers aren’t locked in.
Web apps can run in a PaaS environment like AppScale, the OSS application management framework, either managed and hosted by us via Google AppEngine, or on any provider, or both at the same time. Importantly this includes the NoSQL transactional stores required by apps, either powered by AppScale, which uses Cassandra as a storage layer and vends the Google App Engine Datastore API to applications, or native in App Engine.
AppScale ensures that your apps aren’t locked in.
NoSQL k-v stores can run Apache HBase, the OSS NoSQL engine inspired by our Bigtable whitepaper, either managed and hosted by us via Cloud Bigtable, or on any other provider, or both at the same time.
HBase ensures that your NoSQL isn’t locked in.
OLAP systems can be run using Druid or Drill, two OSS OLAP engines inspired by Google’s Dremel system. These are very similar to BigQuery, and allow you to run on any infrastructure.
Druid and Drill ensure that your OLAP system isn’t locked in.
Advanced RDBMS can be built in Vitess, the OSS MySQL toolkit we helped create, either hosted by us inside Google Container Engine, or on any provider via Kubernetes, or both at the same time. You can also run MySQL fully managed on GCP via CloudSQL.
Vitess ensures that your relational database isn’t locked in.
Data orchestration can run in Apache Beam, the OSS ETL engine we helped create, either managed and hosted by us via Cloud Dataflow, or on any provider, or both at the same time.
Beam ensures that your data ETL isn’t locked in.
Machine Learning can be built in TensorFlow, the OSS ML toolkit we helped create, either managed and hosted by us via CloudML, or on any provider, or both at the same time.
TensorFlow ensures that your ML isn’t locked in.
Object storage can be built in Minio.io which vends the S3 API via OSS, either managed and hosted by us via GCS which also emulates the S3 API, or on any provider, or both at the same time.
Minio ensures that your object store isn’t locked in.
Continuous Deployment tooling can be delivered using Spinnaker, a project started by the Netflix|OSS team, either hosted by us via GKE, or on other providers, or both at the same time.
Spinnaker ensures that your CD tooling isn’t locked in.
What’s still proprietary, but probably OK?
CDN, DNS, Load BalancingBecause the interfaces to these kinds of services are network configurations rather than code, so far these have remained proprietary across providers. NGINX and Varnish make excellent OSS load balancers/front-end caches, but because of the low friction low risk switchability, there’s no real need to avoid DNS or LB services on public clouds.
File SystemsThese are still pretty hard for cloud providers to deliver as managed services at scale; Gluster FS, Avere, ZFS and others are really useful to deliver your own POSIX layers irrespective of environment. If you’re building inside Kubernetes, take a look at CoreOS’s Torus project.
It’s not just software, it’s the data
Lock-in risk comes in many forms, one of the most powerful being data gravity or data inertia. Even if all of your software can move between infrastructures with ease, those systems are connected by a limited, throughput constrained internet and once you have a petabyte written down, it can be a pain in the neck to move. What good is software you can move in a minute if it takes a month to move the bytes?
There are lots of tools that help, both native from GCP, and from our growing partner ecosystem.
- If your data is in an object store, look no further than the Google Storage Transfer Service, an easy automated tool for moving your bits from A to G.
- If you have data on tape or disk, take a look at the Offline Media Import/Export service. You might need to regularly move data to and from our cloud, so take a look at Google Cloud Interconnect for leveraging carriers or public peering points to connect reliably with us.
- If you have VM images you’d like to move to cloud quickly we recommend Cloud Endure to move and transform your images for running on Google Compute Engine.
- If you have a database you need replicated, take a look at Attunity CloudBeam. If you’re trying to migrate bulk data, try FDT from CERN.
- If you’re doing data imports, perhaps Embulk.
ConclusionWe hope the above helps you choose open APIs and technologies designed to help you grow without locking you in. That said, remember that the real proof you have the freedom to move is to actually move; try it! Customers have told us about their new-found power at the negotiating table when they can demonstrably run their application across multiple providers.
All of the above mentioned tools, in combination with strong private networking between providers, allow your applications to span providers with a minimum of provider-specific implementation detail.
If you have questions about how to implement the above, about other parts of the stack this kind of thinking applies to, or about how you can get started, don’t hesitate to reach out to us at Google Cloud Platform, we’re eager to help.