Cassandra Monitoring
Get API access to the information needed to effectively manage and review the health of your cluster.
Access Cassandra metrics
Our Cassandra Monitoring API enables access to Apache Cassandra performance metrics, including CPU utilization, disk utilization, reads and writes, latency tasks, pending compactions, live cells and tombstones per read, SSTables per read, thread pool task statistics, and more. The Instaclustr Console also delivers a visual view of these metrics as part of our monitoring dashboard.
For all Apache Cassandra clusters and nodes managed through the Instaclustr Managed Platform, you can start monitoring your Cassandra database in just a few minutes by creating your cluster from a centralized console. All metrics data can be viewed either on a per-node basis or for all nodes in a cluster, helping you diagnose any issues and enabling better capacity planning. All available metrics are updated every 20 seconds.
Once you authenticate with the monitoring API, the metric information viewable on the monitoring dashboard in our console is available to you, including:
- CPU Usage
- Disk Usage
- Read+Write Per Second
- Pending Compaction
- Active/Pending Repairs
- Partition Size
- Tombstones to Live Cells
- SSTables Per Read
- Replication Strategy Indicator
- Thread Pool Metrics
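As a rough sketch of what an authenticated metrics request could look like, the Python below builds (but does not send) an HTTP request. The base URL, the `metrics` query parameter, the metric identifiers, and the use of HTTP basic authentication are all placeholders assumed for illustration, not the actual Instaclustr API. Consult the Monitoring API documentation for the real endpoint, authentication scheme, and metric names.

```python
import base64
import urllib.parse
import urllib.request

# Hypothetical placeholder endpoint, NOT the real monitoring API URL.
BASE_URL = "https://monitoring.example.invalid/v1/clusters"


def build_metrics_request(cluster_id, metrics, username, api_key):
    """Build (without sending) a GET request for a set of cluster metrics.

    Assumes HTTP basic authentication and a comma-separated `metrics`
    query parameter; both are illustrative assumptions.
    """
    query = urllib.parse.urlencode({"metrics": ",".join(metrics)})
    url = f"{BASE_URL}/{urllib.parse.quote(cluster_id)}?{query}"
    token = base64.b64encode(f"{username}:{api_key}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical cluster ID, metric names, and credentials.
request = build_metrics_request(
    "my-cluster-id", ["cpu_usage", "disk_usage"], "user", "secret-key"
)
print(request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose, since the placeholder host does not exist.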
Sustained high CPU usage is an indicator that your cluster is reaching processing capacity, and you may need to consider adding capacity to cope with any increase in load.
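One way to tell sustained load from a brief spike is to look for a long run of consecutive high readings. The sketch below assumes CPU samples arrive as percentages at the 20-second update interval; the 80% threshold and the 15-sample window (about five minutes) are illustrative choices, not recommended values.

```python
def is_sustained_high(samples, threshold=80.0, min_consecutive=15):
    """Return True if `samples` contains at least `min_consecutive`
    consecutive readings at or above `threshold` percent.

    Both defaults are hypothetical; tune them for your own cluster.
    """
    run = 0
    for value in samples:
        # Extend the current run on a high reading, otherwise reset it.
        run = run + 1 if value >= threshold else 0
        if run >= min_consecutive:
            return True
    return False


# A short spike does not count; a long plateau does.
spike = [30.0] * 10 + [95.0] * 3 + [30.0] * 10
plateau = [30.0] * 5 + [90.0] * 15
print(is_sustained_high(spike))    # False
print(is_sustained_high(plateau))  # True
```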
Keeping an eye on disk usage should be a key part of your capacity planning. Maintaining disk usage at less than 70% during normal operations is recommended.
The reads and writes per second metric helps you identify changing levels of load on your cluster. A significantly uneven distribution across nodes could indicate driver misconfiguration or data model issues.
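A simple way to quantify how uneven the load is: compare the busiest node to the cluster average. The sketch below assumes you already have per-node operations per second in a dictionary; the node names and the idea of using a max-to-mean ratio are illustrative assumptions, not part of the monitoring API.

```python
from statistics import mean


def load_imbalance(ops_per_node):
    """Return the max/mean ratio of per-node operations per second.

    A value of 1.0 means perfectly even load; larger values mean one
    node is handling disproportionately more traffic.
    """
    values = list(ops_per_node.values())
    average = mean(values)
    if average == 0:
        return 1.0  # an idle cluster is treated as balanced
    return max(values) / average


# Hypothetical per-node read+write rates.
balanced = {"node-1": 100, "node-2": 100, "node-3": 100}
skewed = {"node-1": 300, "node-2": 50, "node-3": 50}
print(f"{load_imbalance(balanced):.2f}")  # 1.00
print(f"{load_imbalance(skewed):.2f}")    # 2.25
```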
Compactions are a continuous, background process in Cassandra. A high or increasing number of pending compactions indicates that your cluster does not have sufficient capacity to process the level of operations it requires.
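To distinguish a steadily growing backlog from normal fluctuation, one option is to fit a straight line through recent pending-compaction counts and inspect its slope. This is a minimal sketch using an ordinary least-squares slope over equally spaced samples; the sample values are made up, and what counts as a worrying slope is left to you.

```python
def trend_slope(values):
    """Least-squares slope of `values` against their sample index.

    A positive result means the series is rising on average, in units
    of "pending compactions per sample interval".
    """
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


# Hypothetical pending-compaction counts from consecutive samples.
print(trend_slope([0, 2, 4, 6]))  # 2.0
print(trend_slope([5, 5, 5]))     # 0.0
```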
Repair is a Cassandra operation that ensures data consistency is eventually attained across the ring. Repairs are a scheduled operation and represent an additional load on the cluster.
The partition size metric checks the size of the largest partition in each table. We recommend limiting the maximum partition size to 10 MB for optimal performance, with 100 MB as an upper limit for ongoing stability. Large partitions may significantly impact the performance of Cassandra operations.
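The two partition size limits above map naturally onto a three-level classification. The sketch below assumes sizes are reported in bytes and that 1 MB means 1024 * 1024 bytes; the level names are hypothetical labels chosen for illustration.

```python
MB = 1024 * 1024  # assumption: binary megabytes

OPTIMAL_LIMIT = 10 * MB     # recommended maximum for optimal performance
STABILITY_LIMIT = 100 * MB  # upper limit for ongoing stability


def classify_partition_size(size_bytes):
    """Classify the largest partition size of a table."""
    if size_bytes <= OPTIMAL_LIMIT:
        return "ok"
    if size_bytes <= STABILITY_LIMIT:
        return "warning"
    return "critical"


print(classify_partition_size(4 * MB))    # ok
print(classify_partition_size(40 * MB))   # warning
print(classify_partition_size(400 * MB))  # critical
```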
The tombstones to live cells metric checks the average ratio of tombstones to live cells per read in each table. High ratios of tombstones to live cells (greater than 5x as a starting guide) can substantially reduce the performance of reads from a table.
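As an illustration of applying the 5x starting guide, the sketch below computes the tombstone-to-live-cell ratio per table and lists the tables that exceed it. The input layout (table name mapped to a pair of per-read averages) and the table names are assumptions for the example, not the format returned by the monitoring API.

```python
def tombstone_ratio(tombstones_per_read, live_cells_per_read):
    """Ratio of tombstones to live cells scanned per read."""
    if live_cells_per_read == 0:
        # Only tombstones were scanned: treat as an unbounded ratio.
        return float("inf") if tombstones_per_read > 0 else 0.0
    return tombstones_per_read / live_cells_per_read


def tables_over_ratio(stats, limit=5.0):
    """Return the sorted names of tables whose ratio is greater than `limit`."""
    return sorted(
        table
        for table, (tombstones, live_cells) in stats.items()
        if tombstone_ratio(tombstones, live_cells) > limit
    )


# Hypothetical per-table averages: (tombstones per read, live cells per read).
stats = {
    "ks.events": (120.0, 10.0),   # ratio 12.0
    "ks.users": (2.0, 40.0),      # ratio 0.05
    "ks.sessions": (50.0, 10.0),  # ratio 5.0, not greater than the guide
}
print(tables_over_ratio(stats))  # ['ks.events']
```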
Monitor the latest mean and maximum SSTables per read for each column family, averaged across the cluster.
High numbers of SSTables per read (more than 3 or 4, as a starting guide) can reduce read performance. If read performance is below desired levels, you may need to change the compaction strategy for the affected column family.
This indicator checks the replication class used for each keyspace. NetworkTopologyStrategy is highly recommended to ensure data is replicated in order to minimize the impact of infrastructure failure.
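A check along these lines can be sketched in Python, assuming each keyspace's replication settings are available as a dictionary with a `class` entry (the shape Cassandra uses in its `system_schema.keyspaces` table). The keyspace names and helper functions below are hypothetical.

```python
def uses_network_topology_strategy(replication):
    """Accept either the short or the fully qualified class name."""
    class_name = replication.get("class", "")
    return class_name.rsplit(".", 1)[-1] == "NetworkTopologyStrategy"


def keyspaces_to_review(keyspaces):
    """Return the sorted names of keyspaces not using NetworkTopologyStrategy."""
    return sorted(
        name
        for name, replication in keyspaces.items()
        if not uses_network_topology_strategy(replication)
    )


# Hypothetical keyspaces and their replication settings.
keyspaces = {
    "orders": {
        "class": "org.apache.cassandra.locator.NetworkTopologyStrategy",
        "dc1": "3",
    },
    "scratch": {
        "class": "org.apache.cassandra.locator.SimpleStrategy",
        "replication_factor": "1",
    },
}
print(keyspaces_to_review(keyspaces))  # ['scratch']
```

In practice you would likely exclude Cassandra's own system keyspaces from such a check, since some of them use other replication classes by design.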
Thread pool metrics are associated with each stage of Cassandra’s Staged Event-Driven Architecture (SEDA).