Cluster Health Check

Instaclustr’s Cluster Health page exposes a number of indicators to help you understand your cluster’s long-term performance.

To access the Cluster Health page, navigate to the Monitoring page of your cluster and click on the Health tab.

There are three potential states for each indicator:

Green represents a healthy state;
Amber represents a warning state; and
Red represents a failed state.

For Warning and Failed states, you can click on Problem Information to get specific information about which keyspaces and tables are affected, what the issue is, and how you can potentially fix the issue.

Disk Usage Indicator

The Disk Usage indicator checks the percentage of disk space used on each node. If disk usage has exceeded 75%-80% in the last hour, the node is filling up and is very likely unable to provide enough working space for normal Cassandra operations. Please refer to Disk Usage for more details.

Suggested fix for non-healthy states:

Remove excess data from the cluster
Add more nodes to the cluster
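
As a rough illustration of the check described above, the sketch below maps a node's disk usage percentage to an indicator colour. The split of the quoted 75%-80% range into a 75% warning threshold and an 80% failed threshold is an assumption, and the function name and node readings are hypothetical; this is not Instaclustr's implementation.

```python
def disk_usage_state(used_percent, warn_at=75.0, fail_at=80.0):
    """Map a node's disk usage percentage to an indicator colour.

    warn_at and fail_at are assumed thresholds, not documented values.
    """
    if not 0.0 <= used_percent <= 100.0:
        raise ValueError("used_percent must be between 0 and 100")
    if used_percent > fail_at:
        return "red"
    if used_percent > warn_at:
        return "amber"
    return "green"


# Hypothetical per-node readings for the last hour.
nodes = {"node-1": 62.0, "node-2": 77.5, "node-3": 91.0}
states = {name: disk_usage_state(pct) for name, pct in nodes.items()}
print(states)
```

Any node reported as amber or red here would be a candidate for the fixes listed above.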

Partition Size Indicator

The Partition Size indicator checks the size of the largest partition in each table. We recommend limiting the maximum partition size to 10MB for optimal performance, with 100MB as an upper limit for ongoing stability. Large partitions may significantly impact the performance of Cassandra operations. Please refer to Partition Size for more details.

Suggested fix for non-healthy states:

Remove the problem partition
Re-assess the data model, as data may be unevenly distributed or bunched into too few partitions
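
The two limits above can be sketched as a simple classification. Treating the 10MB recommendation as the warning boundary and the 100MB upper limit as the failed boundary is an assumption, as is reading 1MB as 1024 * 1024 bytes; the function name and table sizes are hypothetical.

```python
MB = 1024 * 1024  # assumed definition of a megabyte for this sketch


def partition_size_state(max_partition_bytes, recommended=10 * MB, upper=100 * MB):
    """Classify a table by the size of its largest partition.

    The mapping of the two limits onto amber and red is an assumption.
    """
    if max_partition_bytes < 0:
        raise ValueError("max_partition_bytes must not be negative")
    if max_partition_bytes > upper:
        return "red"
    if max_partition_bytes > recommended:
        return "amber"
    return "green"


# Hypothetical largest-partition sizes per table, in bytes.
tables = {"events": 4 * MB, "sessions": 35 * MB, "audit_log": 250 * MB}
for table, size in tables.items():
    print(table, partition_size_state(size))
```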

Replication Factor Indicator

The Replication Factor indicator checks the number of replicas set for each datacenter. A replication factor of at least 3 is required for Instaclustr SLAs to apply and highly recommended for data protection and high availability.

Suggested fix for non-healthy states:

Set the replication factor to three or greater for the problem datacenters (note: increasing the replication factor requires repairs to be run after the change to ensure data is correctly distributed. Contact Instaclustr Support for assistance with this operation.)
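
To make the check concrete, the sketch below lists the datacenters in a keyspace's replication settings whose replication factor is below three. It assumes the settings are available as a map with a 'class' entry plus one entry per datacenter, as used with NetworkTopologyStrategy; the function name and datacenter names are hypothetical.

```python
def under_replicated_datacenters(replication, minimum=3):
    """Return the datacenters whose replication factor is below `minimum`.

    `replication` is assumed to be a NetworkTopologyStrategy-style map:
    a 'class' entry plus one replica count per datacenter. Counts may be
    strings or integers.
    """
    return sorted(
        dc
        for dc, replicas in replication.items()
        if dc != "class" and int(replicas) < minimum
    )


# Hypothetical replication settings for one keyspace.
replication = {
    "class": "NetworkTopologyStrategy",
    "dc_east": "3",
    "dc_west": "2",
}
print(under_replicated_datacenters(replication))
```

An empty result would mean every listed datacenter meets the minimum of three.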

Replication Strategy Indicator

The Replication Strategy indicator checks the replication class used for each keyspace. NetworkTopologyStrategy is highly recommended to ensure data is replicated in a way that minimises the impact of likely failures in your infrastructure (e.g. by replicating across AWS availability zones) and to enable additional datacenters to be added to the cluster without table rebuilds.

Suggested fix for non-healthy states:

Change the replication class to NetworkTopologyStrategy for the problem keyspaces
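
For reference, the shape of the CQL statement that sets a keyspace's replication class can be sketched as below. The helper only builds the statement text and does not connect to a cluster; the function name, keyspace name, and datacenter name are hypothetical. Changing replication settings alters where replicas are placed, so treat this as an illustration of the syntax rather than a complete procedure.

```python
import re

# Conservative check for an unquoted keyspace name (an assumption of this sketch).
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def alter_keyspace_statement(keyspace, dc_replicas):
    """Build an ALTER KEYSPACE statement using NetworkTopologyStrategy.

    dc_replicas maps each datacenter name to its replica count.
    """
    if not _IDENTIFIER.match(keyspace):
        raise ValueError("unexpected keyspace name: %r" % keyspace)
    if not dc_replicas:
        raise ValueError("at least one datacenter is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, replicas in sorted(dc_replicas.items()):
        if "'" in dc:
            raise ValueError("unexpected datacenter name: %r" % dc)
        if int(replicas) < 1:
            raise ValueError("replica count must be at least 1")
        options.append("'%s': %d" % (dc, int(replicas)))
    return "ALTER KEYSPACE %s WITH replication = {%s};" % (keyspace, ", ".join(options))


# Hypothetical keyspace and datacenter names.
print(alter_keyspace_statement("my_keyspace", {"dc_east": 3}))
```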

Tombstones to Live Cells Indicator

The Tombstones to Live Cells indicator checks the average ratio of tombstones to live cells per read in each table. High ratios of tombstones to live cells (greater than 5x as a starting guide) can substantially reduce the performance of reads from a table. Please refer to Tombstones and Live Cells for more details.

Suggested fix for non-healthy states:

Tune the compaction strategy to more aggressively remove tombstones
Re-assess the data model
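
The 5x starting guide above can be sketched as a ratio check on per-read averages. The function name and sample figures are hypothetical, and treating a table with tombstones but no live cells as exceeding the guide is an assumption of this sketch.

```python
def exceeds_tombstone_guide(avg_tombstones_per_read, avg_live_cells_per_read, max_ratio=5.0):
    """Return True if tombstones outnumber live cells by more than max_ratio."""
    if avg_tombstones_per_read < 0 or avg_live_cells_per_read < 0:
        raise ValueError("averages must not be negative")
    if avg_live_cells_per_read == 0:
        # Assumed handling: any tombstones with no live cells counts as exceeding the guide.
        return avg_tombstones_per_read > 0
    return avg_tombstones_per_read / avg_live_cells_per_read > max_ratio


# Hypothetical per-table averages: (tombstones per read, live cells per read).
tables = {"events": (12.0, 40.0), "queue": (900.0, 30.0)}
for table, (tombstones, live_cells) in tables.items():
    print(table, exceeds_tombstone_guide(tombstones, live_cells))
```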

By Instaclustr Support
