Solution: Integrating on-premises storage with Google Cloud using an Avere vFXT



Running compute workloads on the cloud can be a powerful way to leverage the massive resources available on Google Cloud Platform (GCP).

Some workloads, such as 3D rendering or HPC simulations, rely on large datasets to complete individual tasks. This isn't a problem when running jobs on-premises, but how do you synchronize tens, or even hundreds, of gigabytes of data with cloud storage in order to run the same workload on the cloud?

Even if your Network Attached Storage (NAS) is situated close enough to a Google Cloud datacenter to mount directly, you can quickly saturate your internet connection when hundreds (or even thousands) of virtual machines attempt to read the same data at the same time from your on-premises NAS.
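To get a feel for how quickly that saturation happens, here is a rough back-of-the-envelope sketch. Every number in it (500 virtual machines, 100 Mbit/s of reads per machine, a 10 Gbit/s link) is a hypothetical placeholder rather than a measurement, and the function name is made up for illustration, so substitute your own figures.

```python
def link_demand(num_vms: int, per_vm_read_mbps: float, link_mbps: float):
    """Estimate how hard a fleet of VMs would push a single network link.

    Returns a tuple of:
      - total demand in Mbit/s if every VM reads at full speed,
      - the oversubscription ratio (demand divided by link capacity),
      - the fair share each VM would actually get, in Mbit/s.
    """
    total_demand = num_vms * per_vm_read_mbps
    oversubscription = total_demand / link_mbps
    fair_share = link_mbps / num_vms
    return total_demand, oversubscription, fair_share


# Hypothetical inputs: 500 render nodes, each wanting 100 Mbit/s,
# sharing a single 10 Gbit/s (10,000 Mbit/s) connection to the NAS.
demand, ratio, share = link_demand(500, 100, 10_000)
print(f"demand: {demand} Mbit/s, oversubscribed {ratio}x, {share} Mbit/s per VM")
```

With these assumed numbers the fleet asks for five times what the link can carry, and each machine is left with a fifth of the throughput it wants.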

You could implement a synchronization strategy to ensure files exist both on-premises and in the cloud, but managing data concurrency, storage, and resources on your own can be challenging, and so can modifying your existing pipeline to perform these actions.

The Avere vFXT is a virtual appliance that provides a solution for such workloads. The vFXT is a cluster of virtual machines that serves as both a read-through cache and POSIX-compliant storage. When you mount the vFXT on your cloud instances, your entire on-premises file structure is represented in the cloud. When files are read on the cloud, they're read from your on-premises NAS, across your secure connection, and onto the vFXT. If a file already exists in the vFXT's cache, it's compared with the on-premises version. If the cached copy and the on-premises copy are identical, the file is not re-read, which can save you bandwidth and reduce time to first byte.
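The read-through behavior described above can be illustrated with a small, self-contained sketch. This is not Avere's implementation: the class names, the in-memory "NAS", and the integer `version` marker used to decide whether two copies are identical are all assumptions made for illustration. The point is only to show the pattern of a cheap comparison followed by a full transfer when the cache is missing or stale.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class OriginFile:
    data: bytes
    version: int  # assumed change marker (for example, a modification time)


@dataclass
class Origin:
    """Hypothetical stand-in for the on-premises NAS, held in memory."""
    files: Dict[str, OriginFile] = field(default_factory=dict)
    bytes_sent: int = 0  # counts payload bytes that crossed the "link"

    def stat(self, path: str) -> int:
        # Cheap metadata check: only the version marker is returned.
        return self.files[path].version

    def read(self, path: str) -> OriginFile:
        # Expensive transfer: the whole file crosses the link.
        f = self.files[path]
        self.bytes_sent += len(f.data)
        return f


class ReadThroughCache:
    """Hypothetical stand-in for the cache layer that clients mount."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self._entries: Dict[str, OriginFile] = {}

    def read(self, path: str) -> bytes:
        cached = self._entries.get(path)
        # Serve from cache only if the origin still has the same version.
        if cached is not None and cached.version == self.origin.stat(path):
            return cached.data
        # Miss or stale entry: fetch from the origin and keep a private copy.
        fresh = self.origin.read(path)
        self._entries[path] = OriginFile(fresh.data, fresh.version)
        return fresh.data


# Usage: 500 clients read the same 1,000-byte file through the cache.
path = "/projects/demo/scene.dat"  # hypothetical path
nas = Origin({path: OriginFile(b"x" * 1000, version=1)})
cache = ReadThroughCache(nas)
for _ in range(500):
    cache.read(path)
print(nas.bytes_sent)  # 1000: only the first read crossed the link

# The file changes on-premises, so the next read is fetched again.
nas.files[path] = OriginFile(b"y" * 2000, version=2)
cache.read(path)
print(nas.bytes_sent)  # 3000
```

In this sketch, each repeated read costs only a metadata check against the origin, so the payload crosses the connection once per version of the file instead of once per client.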

As cloud instances are deployed, they mount the vFXT as they would any other filesystem (either NFS or SMB). The data is available when they need it, and your connection is spared oversaturation.

We recently helped Avere put together a partner tutorial that shows how to incorporate an Avere vFXT into your GCP project. It also provides guidance on different ways to connect to Google Cloud, and how to access your vFXT more securely and efficiently.

Check out the tutorial, and let us know what other Google Cloud tools you’d like to learn how to use in your visual effects or HPC pipeline. You can reach me on Twitter at @vfx_agraham.