Tag Archives: database

El Carro extends the flexibility and choices for Oracle databases on Kubernetes

When we released El Carro, our goal was to provide the best experience possible to run Oracle databases on Kubernetes with the help of our operator. Today, we want to take a closer look at how that works. The diagram below shows the high-level architecture of a database that is managed by El Carro. At the core is the actual database instance with its background processes which run in a single container that contains the Oracle installation. So how does this container image get created and what goes into it? The image itself is essentially a snapshot of a filesystem that contains an operating system, packages and other software, and custom scripts. Specifically for El Carro, an image is made up of a base OS, required packages, and an Oracle database installation. The image must be stored on a container registry that is accessible by the Kubernetes cluster, and El Carro will expect oracle binaries to be installed in certain paths—or create symbolic links to those locations.

Architecture Diagram showing the operator controlling the db container.

Initially, El Carro worked with 12c for Enterprise Edition and 18c for Express Edition. And while 12c is still popular with many users, the extended support ended this summer. So the first news is that we added support for 19c, Oracle’s long term release. The choice should be easy for any new database deployments, but the options don’t end there.

We know that DBAs have different preferences in how and where software gets installed and we believe that making different options available will ultimately empower users. With the exception of Express Edition, redistributing is not a right granted by Oracle licenses, preventing the community from providing a public container registry with usable images. Rather than that, each user will have to build their own image based on binaries they download from Oracle themselves, using their own license agreement with Oracle. All of the other containers used by the El Carro operator use open source software and are made available on our public registry, so that you do not have to build and host them yourself.

Option 1 - Use El Carro to build your own image with GCP

If you are using GCP, then we have an easy way for you to create custom images, you just upload Oracle binaries and patches to your own GCS bucket and start a Cloud Build job that will create the container image for you and upload it to your own, private container registry. A single build script and serverless cloud services take care of the whole process, so that you don’t have to worry about building locally and moving more images across the internet. In addition to creating seeded images (see below), this method also allows you to build containers with the Oracle Patches such as Release Update Revisions (RURs).

Diagram of container image build pipeline where a cloud build job reads installation files from GCS and writes finished images to GCR.

Option 2 - Use El Carro to build you own image locally

You can also use the same Dockerfile and build process from Option 1—but without Google Cloud. Download Oracle installers and patches locally or to a VM used for the builds—then start a script that invokes Docker and builds the image on that machine. Lastly, tag and push the container image to a container registry of your choice. You will have to do a few more steps yourself if you don’t use Cloud Build, but you get the same image and customization options as with Option 1.

Option 3 - Use Oracle build scripts to build your own image

Oracle also maintains an open source repository of scripts to build container images with their database. Maybe you are already using those images either with docker or Kubernetes, or you prefer to use Oracle’s own build method over ours. We recently added functionality to El Carro to make sure that the resulting images work just as well as the ones that El Carro can build for you.

Option 4 - Use Oracle’s Container Registry directly

There is a way to avoid building your own images: The Oracle Container Registry contains pre-built images that can be used with our Kubernetes operator directly and without modification. But since Oracle’s registry can only be accessed by customers, it is protected with a password. After accepting Oracle’s license conditions, one can either copy images to their own registry, or configure OCR as a private repository in Kubernetes.

The Power of Seeding

Aside from the installation, it is the creation of a database that takes the longest time in the initial provisioning process and it is often a frustrating wait time before you can log in and use your database for the first time after creation. To reduce this wait time, the first two options allow you to build a pre-seeded database image that already contains a snapshot of a created and configured database. That way this initialization step is moved to the container build process and minimizes the startup time of new database instances.

Aside from the wait time, relying on a seeded image (i.e. including an empty database in the image can provide consistency in config options if the same image is to be used in multiple deployments).

Option 1 - El Carro on GCP

Option 2 - El Carro local build

Option 3 - Oracle local build

Option 4 - Oracle Container Registry

Versions

12c, 18c, 19c

12c, 18c, 19c

12c, 18c, 19c

19c

Editions

XE, EE

XE, EE

EE

EE

Patches Updates

yes

yes

no

no

Seeded Images

yes

yes

no

no

Automatic build pipeline

yes

no

no

n/a

Conclusion

We believe in an open cloud approach and empowering users with choice and flexibility. In the context of running Oracle databases on Kubernetes that means that you get to choose your database container images. El Carro provides build scripts that allow you to not only customize containers but also to increase security and robustness with the ability to bake patches and updates into the container image. Seeding container images with a database further reduces the deployment time by avoiding this step on first startup - which is especially useful in environments that create many databases - such as automatic test pipelines.

But other users may feel more comfortable in receiving support when they use Oracle’s pre-built images from their registry.

The choice is yours. Just know that El Carro is here to help you modernize your Oracle database workloads with Kubernetes. And if you have any other feature requests or choices that matter to you—let us know by filing an issue on Github.

By Bjoern Rost, Product Manager and Ash Gbadamassi, Software Engineer – Cloud Databases

MySQL to Cloud Spanner via HarbourBridge



Today we’re announcing that HarbourBridge—an open source toolkit that automates much of the manual work of evaluating and assessing Cloud Spanner—supports migrations from MySQL, in addition to existing support for PostgreSQL. This provides a zero-configuration path for MySQL users to try out Cloud Spanner. HarbourBridge bootstraps early stages of migration, and helps get you to the meaty issues as quickly as possible.

Core capabilities

At its core, HarbourBridge provides an automated workflow for loading the contents of an existing MySQL or PostgreSQL database into Spanner. It requires zero configuration—no manifests or data maps to write. Instead, it imports the source database, builds a Spanner schema, creates a new Spanner database populated with data from the source database, and generates a detailed assessment report. HarbourBridge can either import dump files (from mysqldump or pg_dump) or directly connect to the source database. It is intended for loading databases up to a few tens of GB for evaluation purposes, not full-scale migrations.

Bootstrap early-stage migration

HarbourBridge bootstraps early-stage migration to Spanner by using an existing MySQL or PostgreSQL source database to quickly get you running on Spanner. It generates an assessment report with an overall migration-fitness score for Spanner, a table-by-table analysis of type mappings and a list of features used in the source database that aren't supported by Spanner.

View HarbourBridge as a way to get up and running quickly, so you can focus on critical things like tuning performance and getting the most out of Spanner. You will need to tweak and enhance what HarbourBridge produces—more on that later.

Getting started

HarbourBridge can be used with the Cloud Spanner Emulator, or directly with a Cloud Spanner instance. The Emulator is a local, in-memory emulation of Spanner that implements the same APIs as Cloud Spanner’s production service, and allows you to try out Spanner’s functionality without creating a GCP Project. The HarbourBridge README contains a step-by-step quick-start guide for using the tool with a Cloud Spanner instance.

Together, HarbourBridge and the Cloud Spanner Emulator provide a lightweight, open source toolchain to experiment with Cloud Spanner. Moreover, when you want to proceed to performance testing and tuning, switching to a production Cloud Spanner instance is a simple configuration change.

To get started on using HarbourBridge with the Emulator, follow the Emulator instructions. In particular, start the Emulator using Docker and configure the SPANNER_EMULATOR_HOST environment variable (this tells the Cloud Spanner Client libraries to use the Emulator).

Next, install Go and configure the GOPATH environment variable if they are not already part of your environment. Now you can download and install HarbourBridge using
 
GO111MODULE=on \
go get github.com/cloudspannerecosystem/harbourbridge

It should be installed as $GOPATH/bin/harbourbridge. To use HarbourBridge on a MySQL database, run mysqldump and pipe its output to HarbourBridge

mysqldump <opts> db | $GOPATH/bin/harbourbridge -driver=mysqldump

where <opts> are the standard options you pass to mysqldump or mysql to specify host, port, etc., and db is the name of the database to dump.
Similarly, to use HarbourBridge on a PostgreSQL database, run

 pg_dump <opts> db | $GOPATH/bin/harbourbridge -driver=pg_dump

See the Troubleshooting guide if you run into any issues. In addition to creating a new Spanner database with data from the source database, HarbourBridge also generates a schema file, the assessment report, and a bad data file (if any data is dropped). See Files generated by HarbourBridge.

Sample dump files

If you don’t have ready access to a MySQL or PostgreSQL database, the HarbourBridge github repository has some samples. The files cart.mysqldump and cart.pg_dump contain mysqldump and pg_dump output for a very basic shopping cart application (just two tables, one for products and one for user carts). The files singers.mysqldump and singers.pg_dump contain mysqldump and pg_dump output for a version of the Cloud Spanner singers example. To use HarbourBridge on cart.mysqldump, download the file locally and run

$GOPATH/bin/harbourbridge -driver=mysqldump < cart.mysqldump

Next steps

The schema created by HarbourBridge provides a starting point for evaluation of Spanner. While it preserves much of the core structure of your MySQL or PostgreSQL schema, data types will be mapped based on the types supported by Spanner, and unsupported features will be dropped e.g. functions, sequences, procedures, triggers and views. See the assessment report as well as HarbourBridge’s Schema conversion documentation for details.

To test Spanner’s performance, you will need to switch from the Emulator to a Cloud Spanner instance. The HarbourBridge quick-start guide provides details of how to set up a Cloud Spanner instance. To have HarbourBridge use your Cloud Spanner instance instead of the Emulator, simply unset the SPANNER_EMULATOR_HOST environment variable (see the Emulator documentation for context).

To optimize your Spanner performance, carefully review choices of primary keys and indexes—see Keys and indexes. Note that HarbourBridge preserves primary keys from the source database but drops all other indexes. This means that the out-of-the-box performance you get from the schema created by HarbourBridge can be significantly impacted. If this is the case, add appropriate Secondary indexes. In addition, consider using Interleaved tables to optimize table layout and improve the performance of joins.

Recap

HarbourBridge is an open source toolkit for evaluating and assessing Cloud Spanner using an existing MySQL or PostgreSQL database. It automates many of the manual steps so that you can quickly get to important design, evaluation and performance issues, such as. refining choice of primary keys, tuning of indexes, and other optimizations.

We encourage you to try out HarbourBridge, send feedback, file issues, fork and modify the codebase, and send PRs for fixes and new functionality. We have big plans for HarbourBridge, including the addition of user-guided schema conversion (to customize type mappings and provide a guided exploration of indexing, primary key choices, and use of interleaved tables), as well as support for more databases. HarbourBridge is part of the Cloud Spanner Ecosystem, owned and maintained by the Cloud Spanner user community. It is not officially supported by Google as part of Cloud Spanner.

By Nevin Heintze, Cloud Spanner

Introducing Cloud Firestore: Our New Document Database for Apps

Originally posted by Alex Dufetel on the Firebase Blog

Today we're excited to launch Cloud Firestore, a fully-managed NoSQL document database for mobile and web app development. It's designed to easily store and sync app data at global scale, and it's now available in beta.

Key features of Cloud Firestore include:

  • Documents and collections with powerful querying
  • iOS, Android, and Web SDKs with offline data access
  • Real-time data synchronization
  • Automatic, multi-region data replication with strong consistency
  • Node, Python, Go, and Java server SDKs

And of course, we've aimed for the simplicity and ease-of-use that is always top priority for Firebase, while still making sure that Cloud Firestore can scale to power even the largest apps.

Optimized for app development

Managing app data is still hard; you have to scale servers, handle intermittent connectivity, and deliver data with low latency.

We've optimized Cloud Firestore for app development, so you can focus on delivering value to your users and shipping better apps, faster. Cloud Firestore:

  • Synchronizes data between devices in real-time. Our Android, iOS, and Javascript SDKs sync your app data almost instantly. This makes it incredibly easy to build reactive apps, automatically sync data across devices, and build powerful collaborative features -- and if you don't need real-time sync, one-time reads are a first-class feature.
  • Uses collections and documents to structure and query data. This data model is familiar and intuitive for many developers. It also allows for expressive queries. Queries scale with the size of your result set, not the size of your data set, so you'll get the same performance fetching 1 result from a set of 100, or 100,000,000.
  • Enables offline data access via a powerful, on-device database. This local database means your app will function smoothly, even when your users lose connectivity. This offline mode is available on Web, iOS and Android.
  • Enables serverless development. Cloud Firestore's client-side SDKs take care of the complex authentication and networking code you'd normally need to write yourself. Then, on the backend, we provide a powerful set of security rules so you can control access to your data. Security rules let you control which users can access which documents, and let you apply complex validation logic to your data as well. Combined, these features allow your mobile app to connect directly to your database.
  • Integrates with the rest of the Firebase platform. You can easily configure Cloud Functions to run custom code whenever data is written, and our SDKs automatically integrate with Firebase Authentication, to help you get started quickly.

Putting the 'Cloud' in Cloud Firestore

As you may have guessed from the name, Cloud Firestore was built in close collaboration with the Google Cloud Platform team.

This means it's a fully managed product, built from the ground up to automatically scale. Cloud Firestore is a multi-region replicated database that ensures once data is committed, it's durable even in the face of unexpected disasters. Not only that, but despite being a distributed database, it's also strongly consistent, removing tricky edge cases to make building apps easier regardless of scale.

It also means that delivering a great server-side experience for backend developers is a top priority. We're launching SDKs for Java, Go, Python, and Node.js today, with more languages coming in the future.

Another database?

Over the last 3 years Firebase has grown to become Google's app development platform; it now has 16 products to build and grow your app. If you've used Firebase before, you know we already offer a database, the Firebase Realtime Database, which helps with some of the challenges listed above.

The Firebase Realtime Database, with its client SDKs and real-time capabilities, is all about making app development faster and easier. Since its launch, it has been adopted by hundred of thousands of developers, and as its adoption grew, so did usage patterns. Developers began using the Realtime Database for more complex data and to build bigger apps, pushing the limits of the JSON data model and the performance of the database at scale. Cloud Firestore is inspired by what developers love most about the Firebase Realtime Database while also addressing its key limitations like data structuring, querying, and scaling.

So, if you're a Firebase Realtime Database user today, we think you'll love Cloud Firestore. However, this does not mean that Cloud Firestore is a drop-in replacement for the Firebase Realtime Database. For some use cases, it may make sense to use the Realtime Database to optimize for cost and latency, and it's also easy to use both databases together. You can read a more in-depth comparison between the two databases here.

We're continuing development on both databases and they'll both be available in our console and documentation.

Get started!

Cloud Firestore enters public beta starting today. If you're comfortable using a beta product you should give it a spin on your next project! Here are some of the companies and startups who are already building with Cloud Firestore:

Get started by visiting the database tab in your Firebase console. For more details, see the documentation, pricing, code samples, performance limitations during beta, and view our open source iOS and JavaScript SDKs on GitHub.

We can't wait to see what you build and hear what you think of Cloud Firestore!

JanusGraph connects the past and future of Titan

We are thrilled to collaborate with a group of individuals and companies, including Expero, GRAKN.AI, Hortonworks and IBM, in launching a new project — JanusGraph — under The Linux Foundation to advance the state-of-the-art in distributed graph computation.



JanusGraph is a fork of the popular open source project Titan, originally released in 2012 by Aurelius, and subsequently acquired by DataStax. Titan has been widely adopted for large-scale distributed graph computation and many users have contributed to its ongoing development, which has slowed down as of late: there have been no Titan releases since the 1.0 release in September 2015, and the repository has seen no updates since June 2016.

This new project will reinvigorate development of the distributed graph system to add new functionality, improve performance and scalability, and maintain a variety of storage backends.

The name "Janus" comes from the name of a Roman god who looks simultaneously into the past to the Titans (divine beings from Greek mythology) as well as into the future.

All are welcome to participate in the JanusGraph project, whether by contributing features or bug fixes, filing feature requests and bugs, improving the documentation or helping shape the product roadmap through feature requests and use cases.

Get involved by taking a look at our website and browse the code on GitHub.

We look forward to hearing from you!

By Misha Brukman, Google Cloud Platform