- White Paper
Overview
Many people consider Apache Cassandra® and DynamoDB as potential datastore technologies when looking to build high-scale, high-reliability services in the cloud. Both technologies are popular and well-proven to deliver at scale. However, choosing the technology most appropriate for your use case can have a significant impact on the cost of building, maintaining, and running your application.
This white paper considers a real-world use case, analyzes the costs of running it on Instaclustr Managed Apache Cassandra versus DynamoDB, and discusses how the features and cost models of the two technologies could affect the architecture of your solution. The use case is at the heart of Instaclustr’s monitoring system, Instametrics.
The key attributes of the Instametrics cluster:
- 36 i3.2xlarge nodes co-hosting Apache Cassandra and Apache Spark. The cluster runs continuously, with no scaling up or down for peaks.
- Each metric event written is, on average, ~100 bytes of data.
- Baseline load (raw metrics received) of 3,060 batch writes per second. Each batch contains ~150 rows, for a total baseline load of ~460k writes/second.
- Writing roll-ups adds a further 16,200 batch writes/second. Each batch contains ~100 rows, for a total of 1.6M writes/second from this load and a total peak of just over 2M writes per second. This peak load occurs for about 1 minute out of every 5 (20% of the time).
- The baseline read load on the cluster is about 18,000 reads per second. Each read retrieves ~15 rows for a total baseline read load on the cluster of 270k rows/sec.
- Reading data for the roll-ups adds a further load of about 144,000 reads per second. These reads use Cassandra functions to aggregate data before returning, with each read drawing on ~15 rows, for 2.1M rows/sec read in total. The cluster is also at peak read load for about 20% of the time.
- The cluster currently stores around 54 TB of data with a replication factor of 2.
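The throughput figures in the list above follow from simple multiplication, and the short Python sketch below reproduces them so the totals can be checked or adjusted. The constant and function names are illustrative only. Two points are assumptions rather than statements from the white paper: that roll-up reads add to the baseline reads at peak (as the write figures do), and that a time-weighted average is a meaningful summary of the load. With ~15 rows per read, the roll-up read product comes out at about 2.16M rows/sec, close to the 2.1M quoted.

```python
# Figures copied from the cluster attributes listed above.
BASE_WRITE_BATCHES_PER_SEC = 3_060
ROWS_PER_BASE_WRITE_BATCH = 150
ROLLUP_WRITE_BATCHES_PER_SEC = 16_200
ROWS_PER_ROLLUP_WRITE_BATCH = 100
BASE_READS_PER_SEC = 18_000
ROLLUP_READS_PER_SEC = 144_000
ROWS_PER_READ = 15
PEAK_FRACTION = 0.20  # peak lasts about 1 minute out of every 5
NODES = 36


def time_weighted(baseline, peak, peak_fraction=PEAK_FRACTION):
    """Average rate when the peak rate applies for peak_fraction of the time."""
    return baseline * (1 - peak_fraction) + peak * peak_fraction


# Row-level write rates.
base_writes = BASE_WRITE_BATCHES_PER_SEC * ROWS_PER_BASE_WRITE_BATCH
rollup_writes = ROLLUP_WRITE_BATCHES_PER_SEC * ROWS_PER_ROLLUP_WRITE_BATCH
peak_writes = base_writes + rollup_writes

# Row-level read rates.
# Assumption: roll-up reads are added on top of the baseline reads at peak.
base_rows_read = BASE_READS_PER_SEC * ROWS_PER_READ
rollup_rows_read = ROLLUP_READS_PER_SEC * ROWS_PER_READ
peak_rows_read = base_rows_read + rollup_rows_read

# Assumption: a time-weighted average is a useful summary of the load.
avg_writes = time_weighted(base_writes, peak_writes)
avg_rows_read = time_weighted(base_rows_read, peak_rows_read)

print(f"baseline writes/sec:   {base_writes:,}")
print(f"roll-up writes/sec:    {rollup_writes:,}")
print(f"peak writes/sec:       {peak_writes:,}")
print(f"average writes/sec:    {avg_writes:,.0f}")
print(f"baseline rows read/s:  {base_rows_read:,}")
print(f"roll-up rows read/s:   {rollup_rows_read:,}")
print(f"peak rows read/s:      {peak_rows_read:,}")
print(f"average rows read/s:   {avg_rows_read:,.0f}")
print(f"peak writes/sec/node:  {peak_writes / NODES:,.0f}")
```

Separating peak from average load matters for a cost comparison because, as noted above, the Cassandra cluster runs at a fixed size, so it has to be sized for the peak rather than the average.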