Performance of AWS Graviton2 & gp3 Cassandra®

Instaclustr is now offering AWS Graviton2 instances with gp3 volumes for managed Apache Cassandra®, providing customers with both performance and financial benefits.

Instaclustr has released support for AWS EC2 instances running Graviton2 processors, combined with gp3 volumes, for Instaclustr managed Apache Cassandra.

Benchmarking Cassandra node sizes

Instaclustr benchmarked each of our production Cassandra node sizes to compare AWS EC2 instances running x86 processors with gp2 EBS volumes against AWS EC2 instances running Graviton2 ARM-based processors with gp3 EBS volumes. We were keen to put the enhanced processors and volumes through their paces and test whether they delivered the promised improvement in price-for-performance.

Instaclustr uses the cassandra-stress tool for node performance testing. We used the tool to determine the maximum operations rate that a cluster can sustain without being overloaded, and then used those results to compare performance between node sizes. The testing was performed with both small and medium sized payloads, and insert, read, and mixed operation types.
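As a rough illustration of what "the maximum operations rate that a cluster can sustain" means in practice, the search can be pictured as a bisection over target rates. The Python sketch below is not our actual test harness: `is_overloaded` is a hypothetical callback that would run a load test at a fixed rate and report whether the cluster was overloaded, and the sketch assumes overload is monotonic (if one rate overloads the cluster, every higher rate does too).

```python
from typing import Callable


def find_max_sustainable_rate(
    is_overloaded: Callable[[int], bool],
    low: int,
    high: int,
    tolerance: int = 100,
) -> int:
    """Bisect for the highest ops/s rate in [low, high] that does not overload.

    `is_overloaded` is a hypothetical hook: it should run a load test at the
    given rate and return True if the cluster could not keep up.
    """
    if is_overloaded(low):
        raise ValueError("cluster is overloaded even at the lowest rate")
    # Invariant: `low` is always a rate the cluster sustained.
    while high - low > tolerance:
        mid = (low + high) // 2
        if is_overloaded(mid):
            high = mid  # too fast, search the lower half
        else:
            low = mid   # sustained, search the upper half
    return low
```

The returned rate is within `tolerance` operations per second of the true limit, provided the starting `high` is itself an overloading rate.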

The node performance test preloads the cluster with data, making sure the pre-existing data is at least 5 times the max heap for the cluster, so Cassandra will not always read from the in-memory cache. If it only read from cache, the resulting performance would be higher than real-world performance, and the disk would not be tested properly.
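To make the preload rule concrete, here is a small Python sketch of the arithmetic. It assumes "max heap for the cluster" means the sum of the per-node heaps, and it ignores replication and on-disk compression; the heap size and row size used in the usage note are illustrative values, not our test settings.

```python
import math


def rows_to_preload(
    heap_gib_per_node: float,
    nodes: int,
    row_bytes: int,
    factor: int = 5,
) -> int:
    """Rows needed so the preloaded data is at least `factor` x total heap.

    Assumption: cluster heap = per-node heap x node count.
    """
    total_heap_bytes = heap_gib_per_node * nodes * 1024**3
    return math.ceil(factor * total_heap_bytes / row_bytes)
```

For example, `rows_to_preload(8, 3, 1024)` gives the row count for a hypothetical 3-node cluster with 8 GiB heaps and 1 KiB rows.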

Ensuring we are getting performance information about pre-existing data written to disk is particularly important in this case, since the Graviton2 node sizes also have different gp3 EBS volumes attached, while x86 based node sizes have gp2 volumes attached. We want to make sure that we are properly testing the read performance from disk during the tests.

Results

We found that the insert and mixed tests using medium sized payloads had a large variance in the results. We use the rate of increase in pending compactions to assess whether a cluster is overloaded, and found this criterion unreliable because the rate of increase was quite volatile with a larger write payload. As a result, we decided not to use the insert-medium test results. The mixed-medium test results should also be interpreted with caution. We have given these results less weight in our performance comparison and would suggest that they not be relied on too heavily in any other assessments.
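One way to picture a "rate of increase in pending compactions" check is as the slope of a line fitted through periodic samples of the pending-task count (the kind of figure `nodetool compactionstats` reports). The Python sketch below is a simplified stand-in, not our production check: the sample format, the function names, and the `max_slope` threshold are all assumptions chosen for illustration.

```python
from statistics import mean


def pending_compaction_slope(samples: list[tuple[float, float]]) -> float:
    """Least-squares slope of pending compactions over time (tasks/second).

    `samples` is a list of (seconds_since_start, pending_tasks) pairs.
    """
    times = [t for t, _ in samples]
    pending = [p for _, p in samples]
    t_bar, p_bar = mean(times), mean(pending)
    denom = sum((t - t_bar) ** 2 for t in times)
    if denom == 0:
        raise ValueError("need samples at two or more distinct times")
    return sum((t - t_bar) * (p - p_bar) for t, p in samples) / denom


def looks_overloaded(samples: list[tuple[float, float]], max_slope: float = 0.05) -> bool:
    """Flag the run if the backlog grows faster than a hypothetical threshold."""
    return pending_compaction_slope(samples) > max_slope
```

A backlog that oscillates around a steady level yields a slope near zero, while a steadily growing backlog yields a clearly positive one. When the samples themselves swing widely, as in the medium-payload write tests, the fitted slope becomes sensitive to the sampling window, which is the volatility described above.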

R6g.Large

The following is our experimental setup for testing the r6g.large node:

Baseline node: resizeable-large(r5-l)-v2 (GP2)
Test node: CAS-PRD-r6g.large-3200 (GP3)
Cassandra version 4.0.1
Disk size of 3200GiB
Repair and backups disabled
3-node cluster
Client encryption disabled

We note that in the results below, the read-medium operations rate for the test node is about 21% lower than the baseline. While we have not yet established a root cause for this unexpected degradation, the r6g.2xlarge comparison for the same test did show the expected improvement when moving to Graviton2 with gp3. This finding highlights that each use case can have its own performance characteristics, and we recommend carefully managing any migration to confirm performance for your use case.

Here is a sample of the benchmark results:

Operation	Operation Rate Test node (CAS-PRD-r6g.large-3200)	Operation Rate Baseline node (resizeable-large(r5-l)-v2)	% increase in OPS rate
insert-small	10355	5107	102.76%
insert-medium
read-small	5578	2871	94.29%
read-medium	1364	1723	-20.84%
mixed-small	4917	3029	62.33%
mixed-medium	362	345	4.93%

Table 1. Benchmark results for r6g.large node
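The "% increase in OPS rate" column is the relative change of the test node's operations rate over the baseline's. A minimal Python sketch of that calculation, using the Table 1 figures as input (the function and variable names are ours for illustration):

```python
def pct_increase(test_ops: float, baseline_ops: float) -> float:
    """Relative change of test over baseline, as a percentage to 2 decimals."""
    return round((test_ops - baseline_ops) / baseline_ops * 100, 2)


# (test node ops/s, baseline node ops/s) taken from Table 1
table_1 = {
    "insert-small": (10355, 5107),
    "read-small": (5578, 2871),
    "read-medium": (1364, 1723),
    "mixed-small": (4917, 3029),
    "mixed-medium": (362, 345),
}

for operation, (test, baseline) in table_1.items():
    print(f"{operation}: {pct_increase(test, baseline):+.2f}%")
```

A negative value, as for read-medium, means the test node sustained a lower rate than the baseline.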

R6g.2xlarge

The following is our experimental setup for performance testing the r6g.2xlarge node:

Baseline node: resizeable-large(r5-2xl)-v2 (GP2)
Test node: CAS-PRD-r6g.2xlarge-3200 (GP3)
Cassandra version 4.0.1
Disk size of 3200GiB
Repair and backups disabled
3-node cluster
Client encryption disabled

Operation	Operation Rate Test node (CAS-PRD-r6g.2xlarge-3200)	Operation Rate Baseline node (r5-2xl-3200-v2)	% increase in OPS rate
insert-small	58239	55865	4.25%
insert-medium
read-small	69101	43863	57.54%
read-medium	46977	29985	56.67%
mixed-small	34385	29711	15.73%
mixed-medium	316	288	9.72%

Table 2. Benchmark results for r6g.2xlarge node

During the insert tests we noticed that a single cassandra-stress process had itself reached a limit and could not write faster than a certain rate. We therefore ran multiple cassandra-stress processes concurrently, so that the write limit of this node size was properly tested.
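A hedged sketch of how several cassandra-stress writers might be launched side by side is shown below in Python. It is not the script we used: the key count, thread count, and node address are placeholders, and the idea of giving each process its own `-pop seq=` key range is an assumption about how one could avoid the processes overwriting each other's rows.

```python
import subprocess


def stress_commands(total_keys: int, processes: int, node: str, threads: int = 64) -> list[list[str]]:
    """Build one cassandra-stress write command per process.

    Each process gets a disjoint, contiguous key range (illustrative choice).
    """
    if processes < 1 or total_keys < processes:
        raise ValueError("need at least one key per process")
    per_proc = total_keys // processes
    commands = []
    for i in range(processes):
        start = i * per_proc + 1
        # The last process absorbs any remainder.
        end = total_keys if i == processes - 1 else (i + 1) * per_proc
        commands.append([
            "cassandra-stress", "write", f"n={end - start + 1}",
            "-pop", f"seq={start}..{end}",
            "-rate", f"threads={threads}",
            "-node", node,
        ])
    return commands


def run_concurrently(commands: list[list[str]]) -> list[int]:
    """Start every command at once and return their exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]
```

Calling `run_concurrently(stress_commands(30_000_000, 3, "10.0.0.1"))` would start three writers at once; the combined operations rate is then the sum of the rates each process reports.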

Improving Performance With the Amazon Corretto Crypto Provider

We faced a performance issue where Cassandra Graviton2 nodes initially performed much worse than x86 based nodes. Our Apache Kafka® team faced a similar situation when they developed Kafka for Graviton2 nodes. They found that adding the Amazon Corretto Crypto Provider (ACCP) for client-to-broker encryption overcame the performance degradation.

We took the r6g.xlarge Graviton2 node size as an example and ran the cassandra-stress performance tests with and without ACCP. ACCP improved the operations rate in five of the six tests, with the largest gain (28.40%) on insert-small; mixed-medium was the exception, dropping by 7.60%. Here is a sample of the results from the comparison of r6g.xlarge with and without ACCP.

Operation	Operation Rate r6g.xlarge with ACCP	Operation Rate r6g.xlarge without ACCP	% increase in OPS rate
read-small	15148	13298	13.91%
read-medium	13593	12457	9.12%
insert-small	23488	18293	28.40%
insert-medium	255	211	20.85%
mixed-small	12504	12235	2.20%
mixed-medium	547	592	-7.60%

Table 3. Benchmark results for r6g.xlarge with and without ACCP

From the results above we can see that the combination of Graviton2 and gp3 outperforms x86 and gp2 in the majority of tests. In some cases, such as r6g.large read-medium, performance fell below the baseline but remained comparable.

Sign up for a free trial on our Console today to see the improved performance with our managed Cassandra on Graviton2 and gp3, or resize your existing clusters to Graviton2 and gp3 using our in-place Data Center resizing.
