When you concentrate two years’ worth of fundraising into seven hours, every second counts. That’s the reality for Comic Relief, one of the U.K.’s most notable charities. Held every two years, Comic Relief’s Red Nose Day encourages the public to make the world a better place in the easiest way imaginable: by having a great time.
For this year’s fundraising event, Comic Relief turned to Google Cloud’s technology partner Pivotal to host its donation-processing systems on Pivotal Cloud Foundry. The platform also automated management of the underlying cloud infrastructure, and it ran on cloud services from Google Cloud Platform (GCP) during Red Nose Day. In advance of the 2017 event, the charity was forecasting peaks of several hundred transactions a second for its online donation system. The stakes couldn’t have been higher.
We’re happy to report that Comic Relief raised over £73 million (and counting) for its marquee event! We caught up with David Laing, director of software engineering at Pivotal, to discuss running Pivotal Cloud Foundry on GCP for the 2017 event.
What kind of scale were you expecting for Red Nose Day?
Comic Relief does most of its two-year fundraising cycle in a seven-hour window. The donation system needed to scale with 100% uptime and reliability. It’s your classic elastic, spin-up/spin-down use case for the public cloud.
More than 14,000 call center reps take donations via phone and log the donation details in the system. We also expected up to 100,000 concurrent web sessions, where individuals donate online. We expected nearly a million donations in all, with up to 300 donations a second.
What kind of apps did you run on Pivotal Cloud Foundry?
These were cloud-native applications, authored by consultancy Armakuni, in conjunction with Comic Relief. The apps used horizontally scalable, stateless microservices. Capturing donor information and processing their donation immediately is critical. This core availability requirement drove the architecture to have layers upon layers of redundancy. We hosted three independent shards of the full system in different datacenters spread over four countries and two continents, balancing traffic between them using DNS. Each shard then load balanced donations to multiple payment providers. Choosing availability over consistency and an “eventually consistent” architecture like this prepared us to continue to take donations in the event of multiple system failures. An async background process collected all the donation information to a central reporting shard.
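As a rough illustration of the pattern described above (not the actual Comic Relief code, which the interview does not show), the following Python sketch spreads donations across several payment providers within one shard, fails over when a provider is down, and queues each accepted donation for later collection by a reporting process. The names `Provider`, `take_donation`, and `reporting_queue`, along with the two stub providers, are hypothetical.

```python
import queue
import random
from dataclasses import dataclass
from typing import Callable, List


class ProviderError(Exception):
    """Raised when a payment provider cannot process a charge."""


@dataclass
class Provider:
    name: str
    charge: Callable[[int], str]  # amount in pence -> receipt id


# Hypothetical stand-in for the async feed to the central reporting shard.
reporting_queue: "queue.Queue[dict]" = queue.Queue()


def take_donation(amount_pence: int, providers: List[Provider]) -> str:
    """Accept the donation if any one provider succeeds."""
    # Shuffle so load is spread across providers, then fail over in that order.
    for provider in random.sample(providers, len(providers)):
        try:
            receipt = provider.charge(amount_pence)
        except ProviderError:
            continue  # this provider is unavailable, try the next one
        # Record locally; central reporting catches up later.
        reporting_queue.put(
            {"provider": provider.name, "amount": amount_pence, "receipt": receipt}
        )
        return receipt
    raise ProviderError("all payment providers failed")


def failing_charge(amount_pence: int) -> str:
    """Stub provider that is always down (assumption for the demo)."""
    raise ProviderError("provider unavailable")


def working_charge(amount_pence: int) -> str:
    """Stub provider that always succeeds (assumption for the demo)."""
    return f"rcpt-{amount_pence}"


providers = [Provider("primary", failing_charge), Provider("backup", working_charge)]
receipt = take_donation(500, providers)
```

The design choice this mirrors is availability over consistency: the donor gets an answer as soon as one provider accepts the charge, and the central totals are allowed to lag behind.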
What was it like working with GCP’s services?
At Pivotal, we love the performance and rapid provisioning of Compute Engine. The automated usage discounts on Google Cloud are so refreshing. You don’t need engineers to parse through consumption data to minimize your bill.
The load for Comic Relief is highly variable, with major consequences if performance suffers during traffic spikes. Unlike other clouds, GCP load balancers don't require a call to technical support to pre-warm. This saves our cloud admins time and allows us to survive unexpected load increases. It gives us peace of mind knowing that GCP load balancers are built for scale, and backed up by the largest network of any cloud provider. In our experience, Google Cloud is able to handle traffic spikes that might stress other cloud providers.
We used Stackdriver Logging in our weekly capacity tests. We really liked its tight integration with BigQuery and Google Cloud Storage. Having the telemetry data stored in a massively scalable data analysis system helped us to analyze and pinpoint problematic areas ahead of time.
Identity management is another area where GCP shines. Since we already use G Suite for our corporate identity management, managing user access across all the GCP services was effortless.
How was the deployment of Pivotal Cloud Foundry on GCP?
Both Pivotal and Google have invested a lot in making Cloud Foundry and GCP work well together.
Deployment of Pivotal Cloud Foundry on Google Cloud “just worked.” From the application’s perspective, Pivotal Cloud Foundry makes GCP look identical to other clouds, which makes multi-cloud deployment very simple. We followed the recommended deployment architecture and our reference architecture patterns for GCP.
The only real work was in figuring out how many Compute Engine VMs were required to handle the expected traffic.
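A back-of-the-envelope version of that sizing exercise might look like the sketch below. Only the peak of 300 donations a second comes from the interview. The per-VM throughput, the headroom factor, and the minimum VM count are made-up assumptions that a real capacity test would have to supply, and `vms_required` is a hypothetical helper.

```python
import math


def vms_required(
    peak_per_second: float,
    per_vm_per_second: float,
    headroom: float = 2.0,
    min_vms: int = 2,
) -> int:
    """Estimate how many VMs are needed to absorb a peak request rate.

    headroom multiplies the forecast peak to leave a safety margin;
    min_vms keeps some redundancy even for tiny loads.
    """
    if per_vm_per_second <= 0:
        raise ValueError("per_vm_per_second must be positive")
    needed = math.ceil(peak_per_second * headroom / per_vm_per_second)
    return max(needed, min_vms)


# 300/s is the forecast peak from the interview; 25/s per VM is an assumption.
estimate = vms_required(peak_per_second=300, per_vm_per_second=25)
print(estimate)  # 24
```

The per-VM figure is the part worth measuring rather than guessing, since the estimate scales inversely with it.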
For mission-critical workloads—like this scenario with Comic Relief—multi-site availability is a common pattern. This often takes the shape of multi-cloud, as it did with Red Nose Day. What’s your guidance for organizations looking to move to this model?
Organizations need to evolve their application architectures following two key architectural patterns.
The first is to adopt a microservices architecture that breaks an application into stateless and stateful components. Stateless components are easy to scale and distribute, so doing as much of the “work” as possible in these components provides flexibility. Stateful components are harder to manage, so it’s good practice to minimise these and ensure your application degrades gracefully should one of them fail or stall.
The second is to follow 12-factor app principles and build each microservice so that it can be run on an infrastructure-agnostic platform like Pivotal Cloud Foundry. Pivotal Cloud Foundry abstracts away all the differences between different clouds. This makes it trivial to deploy and run the exact same application artifacts in multiple clouds.
An application architected according to the above two principles allows an organisation to wire the full stack together based on performance needs as well as organisational and governance requirements. Most importantly, you get the flexibility to change quickly as requirements change.
Comic Relief—whose donations app is architected like this—can massively scale up the application to run on multiple clouds with multiple layers of redundancy for the seven hours of the year when donations peak. For the rest of the year, they can run a single copy of the donations application in a scaled-down form to minimize costs.
Since Pivotal Cloud Foundry makes all clouds look the same, Comic Relief gets to choose the best cloud provider(s) every year. Over the past five years the app has been run in a private data center and across three public clouds—all with no changes to the application code.
What was the multi-cloud experience like for the engineering teams supporting the event?
This is where Pivotal Cloud Foundry can really help. The platform makes all infrastructure targets look the same. For Comic Relief, and everyone else for that matter, Pivotal Cloud Foundry abstracts away the differences between running on-premises and running on GCP. Once the Pivotal Ops team figured out how to run Pivotal Cloud Foundry on GCP, there was basically no work involved for the app developers. They just had to target a new Pivotal Cloud Foundry endpoint and rerun cf push to get their application running on GCP.
If something unexpected happened on Red Nose Day, the application operations team could simply remove the affected site from the DNS round-robin list. Traffic would be redirected to the other installations while we calmly triaged the issue. Whatever the disruption, we knew that donations would still be accepted and processed.
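The failover step can be pictured with a small in-memory model of a round-robin list. This is a simplified sketch, not a DNS client or the team's tooling: `RoundRobinPool` and the site names are hypothetical, and real DNS changes only take effect as cached records expire, which this model ignores.

```python
from typing import Iterable, List


class RoundRobinPool:
    """Hands out sites in rotation, like a DNS round-robin record set."""

    def __init__(self, sites: Iterable[str]) -> None:
        self._sites: List[str] = list(sites)
        self._counter = 0

    def remove(self, site: str) -> None:
        """Take an unhealthy site out of rotation."""
        self._sites.remove(site)

    def resolve(self) -> str:
        """Return the next site in rotation."""
        if not self._sites:
            raise RuntimeError("no sites left in rotation")
        site = self._sites[self._counter % len(self._sites)]
        self._counter += 1
        return site


# Hypothetical site names standing in for the independent installations.
pool = RoundRobinPool(["site-a", "site-b", "site-c"])
before = [pool.resolve() for _ in range(3)]

pool.remove("site-b")  # operations team pulls the affected site
after = {pool.resolve() for _ in range(4)}
```

After the removal, every new lookup lands on one of the two remaining sites, which is what lets the team investigate the third without blocking donations.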
Want to learn more about how Pivotal and Google are collaborating? Check out the Cloud Native Roadshow in a city near you. To hear more from Comic Relief, please register for Google Cloud Next London May 3-4.