This section describes the automated Snapshot Backup service that is provided to all Instaclustr-managed clusters. It also describes the optional Continuous Backup service that allows for more frequent backups to be taken.
Instaclustr provides two backup services: Snapshot Backup and Continuous Backup. Both services transfer backed-up data to cloud storage (e.g. an S3 storage bucket for an AWS cluster), where it is retained for 7 days.
This section also describes Instaclustr’s backup functionality for Cassandra secondary indexes.
Snapshot Backup is the default backup service. Under Snapshot Backup, all cluster nodes will perform a snapshot backup once every 24 hours. This involves running a nodetool snapshot operation across all keyspaces on the node and then uploading the snapshot files to cloud storage. Node snapshot timing is staggered to reduce the impact of backup operations on the overall cluster performance.
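To illustrate what staggered snapshot timing can look like, the following is a minimal sketch that spreads one snapshot per node evenly across the 24-hour interval. The even-spacing policy, the function name `staggered_offsets` and the node names are assumptions made for illustration; the actual scheduling used by the backup service is not described here.

```python
from datetime import timedelta


def staggered_offsets(nodes, interval=timedelta(hours=24)):
    """Return a start offset per node, spaced evenly across the interval.

    Hypothetical policy for illustration only: the real service may
    stagger nodes differently.
    """
    if not nodes:
        return {}
    step = interval / len(nodes)
    return {node: step * i for i, node in enumerate(nodes)}


# Example with four hypothetical nodes: no two nodes start at the same time.
offsets = staggered_offsets(["node-1", "node-2", "node-3", "node-4"])
for node, offset in offsets.items():
    print(node, "starts its snapshot", offset, "after the interval begins")
```

With offsets like these, only one node at a time is running its snapshot and upload, which is the effect the staggering is intended to achieve.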
Continuous Backup can optionally be enabled to perform backups more frequently. Enabling Continuous Backup for a cluster will:
- Increase the frequency of snapshot backups to once every 3 hours
- Enable commit log backups once every 5 minutes. As each node rotates its commit logs, it will archive the log and schedule it to be copied to cloud storage. Once copied to cloud storage, the backup service removes the archives on the node.
Increased snapshot schedules combined with commit log backups provide:
- Reduced window of potential data loss (i.e. a lower recovery point objective)
- Selective restores of specific tables
Like Snapshot Backup, node snapshot timing under Continuous Backup is staggered to reduce the impact of backup operations on the overall cluster performance.
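The difference between the two services can be worked through as simple arithmetic. The sketch below uses only the intervals stated in this section (24 hours, 3 hours and 5 minutes); the function names are hypothetical, and the loss-window figure is an approximation that assumes upload time to cloud storage is negligible.

```python
from datetime import timedelta

DAY = timedelta(hours=24)

# Snapshot intervals stated in this section.
SNAPSHOT_INTERVAL = {
    "snapshot": timedelta(hours=24),
    "continuous": timedelta(hours=3),
}
# Commit log backups apply to Continuous Backup only.
COMMIT_LOG_INTERVAL = timedelta(minutes=5)


def snapshots_per_day(service):
    """Number of snapshots each node takes per day under the given service."""
    return DAY // SNAPSHOT_INTERVAL[service]


def approx_loss_window(service):
    """Approximate worst-case window of data not yet backed up.

    Assumption: the time taken to upload to cloud storage is ignored.
    """
    if service == "continuous":
        return COMMIT_LOG_INTERVAL
    return SNAPSHOT_INTERVAL[service]


for service in ("snapshot", "continuous"):
    print(service, snapshots_per_day(service), approx_loss_window(service))
```

Under these assumptions, Continuous Backup takes eight snapshots per node per day instead of one, and the approximate loss window shrinks from 24 hours to about 5 minutes.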
Continuous Backup is enabled by selecting the Continuous Backup option when creating the cluster.
Continuous Backup can also be selected when adding new data centres to an existing cluster.
If you would like to enable Continuous Backup on an existing cluster, please contact our support team.
Backups of Cassandra Secondary Indexes
The Instaclustr backup services back up any secondary indexes a cluster has to a cloud storage location (e.g. an S3 storage bucket for an AWS cluster). The exact location and the file naming convention used for the backed-up files depend on the type of secondary index and the version of Cassandra.
- Regular Secondary Index
- Cassandra 2.2+: Secondary indexes are stored as sstables in a separate directory inside their respective table directories. The secondary index directory is named ‘.nameOfTheIndex’. The naming convention of the sstable files is ‘md-#-big-*’, e.g. md-1-big-Data.db
- Cassandra 2.1.x and Cassandra 2.0.x: Secondary indexes are stored as sstables in the same directory as their respective tables. The naming conventions of the sstable files are:
  - Cassandra 2.0.x: ‘keyspace-table.nameOfTheIndex-jb-#-*’, e.g. testkeyspace-testtable.testindex-jb-1-Data.db
  - Cassandra 2.1.x: ‘keyspace-table.nameOfTheIndex-ka-#-*’, e.g. testkeyspace-testtable.testindex-ka-1-Data.db
- SASI Index (SSTable Attached Secondary Index)
- SASI index files follow a naming convention such as md-1-big-SI_table_column_idx.db

Another important difference with SASI indexes is that if a cluster already has a SASI index before the Instaclustr backup service is started, the backup service will not back up that index. In such a scenario, the Cassandra service needs to be restarted. If this situation occurs on a production cluster, please contact our technical support team for assistance.
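When inspecting backed-up files, the naming conventions above can be used to tell the index types apart. The following is a rough sketch only: the regular expressions are derived from the examples in this section rather than from Cassandra itself, the function name `classify_index_file` is hypothetical, and the paths in the usage example are made up for illustration.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Patterns mirror the examples in this section; they are illustrative,
# not an exhaustive description of Cassandra's file naming.
SASI = re.compile(r"^md-\d+-big-SI_.+\.db$")
MD_SSTABLE = re.compile(r"^md-\d+-big-.+$")
LEGACY = re.compile(
    r"^(?P<keyspace>[^-]+)-(?P<table>[^-.]+)\.(?P<index>[^-]+)"
    r"-(?P<fmt>jb|ka)-\d+-.+$"
)
LEGACY_VERSIONS = {"jb": "Cassandra 2.0.x", "ka": "Cassandra 2.1.x"}


def classify_index_file(path: str) -> Optional[str]:
    """Return a label for a secondary index file, or None if not recognised."""
    p = PurePosixPath(path)
    if SASI.match(p.name):
        return "SASI index"
    # Under the 2.2+ convention, a regular index lives in a '.nameOfTheIndex'
    # directory, so the parent directory is what identifies it as an index.
    if MD_SSTABLE.match(p.name) and p.parent.name.startswith("."):
        return "regular index (Cassandra 2.2+)"
    legacy = LEGACY.match(p.name)
    if legacy:
        return "regular index (" + LEGACY_VERSIONS[legacy.group("fmt")] + ")"
    return None


# Hypothetical paths, for illustration only.
for example in (
    "testkeyspace/testtable/.testindex/md-1-big-Data.db",
    "testkeyspace/testtable/testkeyspace-testtable.testindex-jb-1-Data.db",
    "testkeyspace/testtable/testkeyspace-testtable.testindex-ka-1-Data.db",
    "testkeyspace/testtable/md-1-big-SI_table_column_idx.db",
    "testkeyspace/testtable/md-1-big-Data.db",
):
    print(example, "->", classify_index_file(example))
```

Note that for the ‘md-#-big-*’ convention the file name alone does not distinguish a regular index sstable from an ordinary table sstable, which is why the sketch also checks the parent directory name.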