Instaclustr has a number of internal tools and procedures that help us keep Cassandra clusters healthy. One of those tools allows us to replace the instance backing a Cassandra node while keeping its IPs and data. Dynamic Resizing, released in June 2017, uses technology developed for that tool to allow customers to scale Cassandra clusters vertically based on demand.
Initially, the replace tool operated by detaching the volumes from the instance being replaced and re-attaching them to the new instance. This limited the tool to EBS-backed instances. Another often-requested extension was the ability to resize a Data Centre to a different node size, either to upgrade to a newly added node size or to switch over to the resizable class of nodes.
One option for changing instance size where we could not just detach and reattach data volumes was to use Cassandra’s native node replace functionality to replace each instance in the cluster in a rolling fashion. At first, this approach seems attractive, as it can be conducted with zero downtime. However, quite some time ago we realised that, unless you run a repair between each replacement, this approach will almost certainly lose a small amount of data whenever a replace operation exceeds the hinted handoff window. As a result, we relied on fairly tedious and complex methods of rolling upgrades involving detaching and re-attaching EBS volumes.
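To make the hinted handoff constraint concrete, here is a minimal sketch of the check it implies: any replace operation that runs longer than the hint window should be followed by a repair before the next node is replaced. The function name is hypothetical, and the 3-hour window is an assumption based on the `max_hint_window_in_ms: 10800000` default shipped in `cassandra.yaml`; a given cluster may be configured differently.

```python
from datetime import timedelta

# Assumed hint window: 10,800,000 ms (3 hours), the cassandra.yaml default
# for max_hint_window_in_ms. Check the cluster's actual setting.
DEFAULT_HINT_WINDOW = timedelta(milliseconds=10_800_000)


def replaces_needing_repair(replace_durations, hint_window=DEFAULT_HINT_WINDOW):
    """Return the indexes of replace operations that outlasted the hint window.

    Hypothetical helper: each flagged replacement should be followed by a
    repair before the rolling replace continues.
    """
    return [
        index
        for index, duration in enumerate(replace_durations)
        if duration > hint_window
    ]


durations = [timedelta(hours=1), timedelta(hours=5), timedelta(hours=2, minutes=59)]
print(replaces_needing_repair(durations))  # [1]
```

On clusters with large nodes, streaming a full replacement can easily take longer than the window, which is why a rolling native replace without intermediate repairs is risky.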
To address this problem, we have recently extended the replace tool to remove these limitations and support the complete rolling replace use case. The new “copy data” replace mode replaces a node in the following stages:
- Provision the new node of the desired size
- Copy most of the data from the old node to the new node
- Stop the old node
- Reallocate IPs
- Join the replacement node to the cluster
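The stages above form a strictly ordered pipeline: each one must finish before the next begins. The sketch below illustrates only that ordering. It is not the actual replace tool; every name in it (`ReplaceContext`, `run_replace`, the stage functions) is hypothetical, and each stage body is a placeholder that simply records its own completion.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ReplaceContext:
    """State carried through one node replacement (all fields hypothetical)."""
    old_node: str
    new_node_size: str
    completed: List[str] = field(default_factory=list)


# Placeholder stages: a real tool would call provisioning, backup and
# networking systems here. These only record that the stage ran.
def provision_new_node(ctx: ReplaceContext) -> None:
    ctx.completed.append("provision_new_node")


def copy_bulk_data(ctx: ReplaceContext) -> None:
    ctx.completed.append("copy_bulk_data")


def stop_old_node(ctx: ReplaceContext) -> None:
    ctx.completed.append("stop_old_node")


def reallocate_ips(ctx: ReplaceContext) -> None:
    ctx.completed.append("reallocate_ips")


def join_cluster(ctx: ReplaceContext) -> None:
    ctx.completed.append("join_cluster")


STAGES: List[Callable[[ReplaceContext], None]] = [
    provision_new_node,
    copy_bulk_data,
    stop_old_node,
    reallocate_ips,
    join_cluster,
]


def run_replace(ctx: ReplaceContext) -> List[str]:
    """Run the stages in order; an exception in any stage halts the pipeline."""
    for stage in STAGES:
        stage(ctx)
    return ctx.completed


print(run_replace(ReplaceContext(old_node="node-1", new_node_size="large")))
```

Note that the old node keeps serving traffic through the first two stages; downtime only begins at the third.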
Provisioning is trivial with our powerful provisioning system, but copying large amounts of data from a live node presents some specific challenges. We had to develop a solution that could copy that data without creating too much additional load on a cluster that might already be under stress. We also had to work carefully within the constraints created by Cassandra’s hinted handoff system.
We explored a number of solutions to the problem of copying data to the new node while minimising the impact on the running nodes. After discarding several alternatives, we settled on a solution built on Instaclustr’s existing, proven backup/restore system. This places minimal resource strain on the node being replaced: most of the data is already stored in cloud storage, so we only need to copy the data added since the last backup was taken.
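The saving comes from the fact that Cassandra's SSTable files are immutable once written, so a file that is already in the backup never needs to be read from the live node again. A minimal sketch of that planning step follows; it assumes files can be matched by name alone, and `plan_copy` and the file names are illustrative rather than part of Instaclustr's backup system.

```python
from typing import Set, Tuple


def plan_copy(on_old_node: Set[str], in_backup: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Split the old node's files into two groups.

    Returns (restore_from_backup, upload_from_node):
    - restore_from_backup: files the new node can fetch from cloud storage
      without touching the old node;
    - upload_from_node: files written since the last backup, which the old
      node still has to upload.
    """
    restore_from_backup = on_old_node & in_backup
    upload_from_node = on_old_node - in_backup
    return restore_from_backup, upload_from_node


# Illustrative SSTable file names only.
old_node_files = {"mc-1-big-Data.db", "mc-2-big-Data.db", "mc-3-big-Data.db"}
backup_files = {"mc-1-big-Data.db", "mc-2-big-Data.db"}

restore, upload = plan_copy(old_node_files, backup_files)
print(sorted(restore))  # ['mc-1-big-Data.db', 'mc-2-big-Data.db']
print(sorted(upload))   # ['mc-3-big-Data.db']
```

The more recent the last backup, the smaller the second set, and the less work the live node has to do.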
Stopping the old node without losing data requires stopping Cassandra and then uploading the data that has been written since the previous step. This process usually completes within 10 minutes, which keeps the degradation of cluster performance to a minimum.
After all of the data is on the new node, the old node is terminated, its public and private IPs are transferred to the new node, and Cassandra is started on the new node. As the replacement node joins, it receives the data that it missed during the short downtime as hinted handoffs.
The new solution has allowed us to standardise our approach to node replacement for all instance types (local and external storage), using the proven technology of our Cassandra backup system to improve the overall performance of the process. At the moment, this resize functionality is controlled by our administrators and can be requested by customers via our support channel. We will likely make the functionality available directly to users in the future.