Instaclustr is pleased to announce the general release of two significant enhancements to the backup and restore capabilities of Instaclustr Managed Apache Cassandra and Elassandra:
- Continuous Backup – providing near zero data loss backups; and
- User Controlled Restore – giving you the power to restore from an existing cluster’s backups to a new cluster, using the Instaclustr Console.
Together, these enhancements provide stronger protection against data loss and let developers and sysadmins use backups in new ways.
Cassandra’s native replication provides very strong protection against many of the failures for which backups are traditionally required, such as infrastructure failure. However, replication does not protect against data corruption or deletion caused by application bugs, operator error, or malicious action. For this reason, it remains important to have appropriate backups of data stored in Cassandra: a reliable backup scheme is a critical part of any disaster recovery plan.
With Continuous Backup, Instaclustr now offers two types of backup service: Snapshot Backup and Continuous Backup. Both transfer cluster data backups to cloud storage, where they are retained for seven days.
Our traditional backup service, and still the default, is Snapshot Backup, which performs a snapshot backup for all cluster nodes once every 24 hours.
Continuous Backup can optionally be enabled for a cluster to perform backups more frequently. Enabling it increases the frequency of snapshot backups to once every three hours and adds continuous backup of commit logs at five-minute intervals. This significantly reduces the window of potential data loss, providing near zero data loss.
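To make the difference between the two backup modes concrete, the short sketch below works through the arithmetic using the intervals quoted above. It rests on a simplifying assumption that is not stated in this announcement: that the worst-case data loss window equals the interval between consecutive backups, ignoring the time taken to upload each backup. The function names are illustrative only.

```python
from datetime import timedelta

# Intervals and retention period as quoted in this announcement.
SNAPSHOT_BACKUP_INTERVAL = timedelta(hours=24)      # Snapshot Backup (default)
CONTINUOUS_SNAPSHOT_INTERVAL = timedelta(hours=3)   # snapshots with Continuous Backup
COMMIT_LOG_INTERVAL = timedelta(minutes=5)          # commit log backups
RETENTION = timedelta(days=7)


def worst_case_loss_window(continuous_backup: bool) -> timedelta:
    """Upper bound on how much recent data could be missing from backups.

    Assumption: the bound is the gap between consecutive backups; upload
    time and any processing delay are ignored.
    """
    return COMMIT_LOG_INTERVAL if continuous_backup else SNAPSHOT_BACKUP_INTERVAL


def snapshots_in_retention(continuous_backup: bool) -> int:
    """Number of snapshot backups taken per node during the retention period."""
    interval = (
        CONTINUOUS_SNAPSHOT_INTERVAL if continuous_backup else SNAPSHOT_BACKUP_INTERVAL
    )
    return RETENTION // interval


if __name__ == "__main__":
    for enabled in (False, True):
        label = "Continuous Backup" if enabled else "Snapshot Backup"
        print(label, worst_case_loss_window(enabled), snapshots_in_retention(enabled))
```

Under these assumptions, enabling Continuous Backup shrinks the worst-case window from 24 hours to five minutes, and raises the number of snapshots taken over the retention period from 7 to 56 per node.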
User Controlled Restores
In addition to the new option for low-data-loss backups, we have introduced User Controlled Restores to provide Instaclustr users with faster and more flexible data recovery and duplication options.
Cluster backups may now be used to clone an existing cluster to a specified point in time using the Instaclustr Console or the Provisioning API. Such a restored cluster could serve as a staging environment or test environment for functional or performance testing. Additional restore options include:
- Restoring all data, or only specified tables.
- Restoring to the latest backup, or to a specified point in time.
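The restore options above can be sketched as a small helper that assembles a restore request. This is an illustration only: the function name and every field name in the returned dictionary are hypothetical and are not taken from the Instaclustr Provisioning API, so consult the API documentation for the real request format. The seven-day limit reflects the retention period mentioned earlier.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence

RETENTION = timedelta(days=7)  # retention period quoted in this announcement


def build_restore_request(
    source_cluster_id: str,
    new_cluster_name: str,
    tables: Optional[Sequence[str]] = None,
    point_in_time: Optional[datetime] = None,
    now: Optional[datetime] = None,
) -> dict:
    """Assemble a restore request as a plain dictionary.

    All keys are hypothetical placeholders, not real API fields.
    Datetimes must be timezone-aware.
    """
    now = now or datetime.now(timezone.utc)
    request = {
        "sourceClusterId": source_cluster_id,   # hypothetical field name
        "restoredClusterName": new_cluster_name,  # hypothetical field name
    }
    # Omitting `tables` means "restore all data".
    if tables:
        request["tables"] = list(tables)  # hypothetical field name
    # Omitting `point_in_time` means "restore to the latest backup".
    if point_in_time is not None:
        if point_in_time > now:
            raise ValueError("point_in_time is in the future")
        if now - point_in_time > RETENTION:
            raise ValueError("point_in_time is older than the retention period")
        request["pointInTime"] = point_in_time.isoformat()  # hypothetical field name
    return request
```

Leaving out both optional arguments corresponds to the simplest case: restoring all data to the latest backup.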
These new features enable Instaclustr users to spin up replicas of their clusters from backups with a couple of button clicks or a single API call. For example, we have used this capability to create a new 3 data centre, 18 node cluster loaded with data from an existing cluster in under one hour by executing a single command (recovery time for other clusters will depend on the volume of data per node).
This ease of restoring from backups not only greatly improves recovery time in data recovery scenarios but also allows for the easy use of backup and restore functionality in scenarios such as:
- Cloning a production database so that the clone can be used for performance or functional testing. Optionally, the Instaclustr API could be used on a schedule to regularly refresh the test environment with production data.
- Restoring a selected table at several points in time to analyse changes in data.
- Resolving a complex data corruption situation caused by an application defect or other change, where the fix involves more than a simple rollback to a point in time.
Important details about how to use Continuous Backup are available in the following articles:
Restoring a Cluster: