One question we often get from prospective customers is “Can you migrate my existing Cassandra cluster to Instaclustr without any downtime?”. The answer to that is, of course, “YES!”. We have done this on numerous occasions now. This blog gives a brief walk-through of the process we follow to achieve that migration.
This procedure also illustrates some of the great manageability advantages that are available when you combine Cassandra’s native multi-data center capabilities with automated provisioning in the cloud. We use a procedure very similar to this for operations such as splitting out column families from an individual keyspace to multiple clusters (useful where the column families have very different target performance capabilities) and migrating clusters on to newer generations of hardware.
The process and what you can expect
At a high level, the process we follow is to set up the new, Instaclustr-managed cluster as a second data center in the existing cluster, synchronise the data to the new cluster, cut the app over to the new cluster and then cut the link to the old cluster. This approach means that:
- Cassandra services will be 100% available at all stages during migration.
- If your app supports changing connection settings while remaining online, your app can remain 100% available during the migration.
- Up until the point where we cut the connection to the old cluster, there is a very quick roll-back strategy to the existing configuration. The app is up and running on the new cluster before we cross this point.
Detailed steps for zero downtime migration
In a bit more detail, the steps we perform to complete this migration are:
- Prepare the existing environment: ensure that the app is using a DC-aware load balancing policy and LOCAL_* consistency levels (for example, LOCAL_QUORUM). Ensure that all keyspaces to be copied to the new cluster are using the NetworkTopologyStrategy replication strategy (if they are not currently set up this way, changing might be tricky; we recommend using this strategy for all keyspaces when you create them).
- Create the new cluster: Create a new cluster using the Instaclustr console. Ensure the Cassandra version and cluster name are the same as the existing cluster and the data center name is different from the existing data center.
- Join the clusters: we will make necessary firewall rule changes (some may also be required on the source cluster side) and then change seed nodes in the new cluster and start the nodes so the new cluster becomes a second data center in the existing cluster.
- Change replication settings: In the existing cluster, the replication settings for keyspaces to be copied will be updated to specify that data should be replicated to the newly added data center.
- Copy data to the new cluster: once the clusters are joined, Cassandra will start replicating writes to the new cluster. However, existing data will need to be copied across to the new cluster by using the nodetool rebuild command. We will typically execute this one or two nodes at a time on the new cluster to limit the streaming load on the existing cluster.
- Cutover app: Once the rebuilds are complete, both clusters will have a complete copy of the data, which Cassandra automatically keeps in sync. At this point, the initial connection points for the application can be changed from nodes in the old cluster to nodes in the new cluster. The new cluster will then serve all reads, and writes will go to the new cluster first before being replicated to the old cluster. At this point we will typically run a repair operation across the cluster to fully verify that all data from the existing cluster has been successfully replicated.
- Decommission existing cluster: Connectivity from the new cluster to the existing cluster is severed by changing firewall rules, replication settings are updated in the new cluster to stop replicating data to the old cluster, and the old cluster can then be shut down.
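To make the replication and rebuild steps above a little more concrete, here is a minimal Python sketch that only builds the CQL statements and nodetool commands involved; it does not connect to or change any cluster. The keyspace name (app_data), the data center names (old_dc, new_dc), the replication factor of 3 and the node addresses are all hypothetical placeholders, and the helper functions are illustrative rather than part of any Instaclustr tooling.

```python
from typing import Dict, Iterator, List


def alter_keyspace_cql(keyspace: str, replicas_per_dc: Dict[str, int]) -> str:
    """Build an ALTER KEYSPACE statement using NetworkTopologyStrategy.

    replicas_per_dc maps each data center name to its replication factor.
    """
    options = ["'class': 'NetworkTopologyStrategy'"]
    options += [f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items()]
    replication_map = "{" + ", ".join(options) + "}"
    return f"ALTER KEYSPACE {keyspace} WITH replication = {replication_map};"


def rebuild_batches(new_nodes: List[str], batch_size: int = 2) -> Iterator[List[str]]:
    """Yield the new data center's nodes in small groups, so that only a
    few nodes stream from the existing cluster at any one time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(new_nodes), batch_size):
        yield new_nodes[start:start + batch_size]


def rebuild_command(node: str, source_dc: str) -> str:
    """Build the nodetool command that makes `node` stream existing data
    from the data center named `source_dc`."""
    return f"nodetool -h {node} rebuild -- {source_dc}"


if __name__ == "__main__":
    # Change replication settings: add the new data center (hypothetical names).
    print(alter_keyspace_cql("app_data", {"old_dc": 3, "new_dc": 3}))

    # Copy data: rebuild the new nodes one or two at a time.
    new_nodes = ["10.0.1.1", "10.0.1.2", "10.0.1.3"]  # placeholder addresses
    for batch in rebuild_batches(new_nodes, batch_size=2):
        for node in batch:
            print(rebuild_command(node, "old_dc"))

    # Decommission: stop replicating to the old data center.
    print(alter_keyspace_cql("app_data", {"new_dc": 3}))
```

Generating the statements separately from running them keeps each change easy to review before it is applied, which matters most for the final replication change, since that is the point after which roll-back is no longer quick.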
This process might seem complex, but don’t worry: that’s what Instaclustr’s expert tech ops staff are here for. Email us at firstname.lastname@example.org. We have run through this process many times and will carry out much of the work, and we will provide you with detailed instructions and support where actions are required on your side to complete the process.