What Is Amazon Keyspaces (for Apache Cassandra)?
Apache Cassandra is a highly scalable, open source NoSQL database developed to handle large amounts of data across many commodity servers. In the AWS ecosystem, this functionality is delivered through a managed service known as Amazon Keyspaces (for Apache Cassandra). Amazon Keyspaces offers a serverless database solution that eliminates the need to manage infrastructure for Cassandra databases.
Amazon Keyspaces is compatible with the Cassandra Query Language (CQL), which simplifies migration and integration of existing Cassandra applications. It provides automatic scaling and high availability without requiring deep technical knowledge of Cassandra, and it integrates with other AWS services.
The differences between Apache Cassandra and Amazon Keyspaces
Feature | Apache Cassandra | Amazon Keyspaces (for Apache Cassandra) |
Management | Requires full infrastructure management, including hardware provisioning, software updates, and performance optimization. | Fully managed by AWS, which handles provisioning, updates, and scaling. |
Scalability and Availability | The architecture is designed for scalability and high availability, but achieving both requires manual setup and expertise. | Automatically scales to meet application demands and provides high availability by default. |
Infrastructure | Organizations are responsible for deploying and managing clusters across servers, but a managed service can do this for them. | Serverless architecture, meaning AWS manages server provisioning and cluster configuration automatically. |
Operational Overhead | Significant operational overhead from infrastructure and database management, unless a managed service is used. | Minimal operational overhead, as AWS handles administrative tasks, allowing developers to focus on application development. |
Skill Requirement | Requires expertise in managing and configuring Cassandra clusters or support via a managed service. | No deep Cassandra expertise required; AWS automates infrastructure tasks and scaling. |
Learn more in our detailed guide to managed Cassandra
How Amazon Keyspaces works
Amazon Keyspaces operates as a serverless database service, meaning it automatically handles the infrastructure management tasks typically associated with running a database. This includes provisioning servers, configuring clusters, and managing software updates. Users run queries in CQL, the same query language used with Cassandra.
At its core, Amazon Keyspaces uses a partitioned data model similar to Apache Cassandra. Data is distributed across partitions, which allows for horizontal scaling and ensures high availability. When a read or write request is made, Amazon Keyspaces routes the request to the appropriate partition, balancing the load across multiple servers. This mechanism supports the service’s ability to handle large volumes of simultaneous requests with low latency.
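The routing idea can be illustrated with a minimal sketch. It assumes a simple hash-then-modulo scheme; the source does not describe how Amazon Keyspaces routes requests internally, and Apache Cassandra's default partitioner uses Murmur3 rather than the SHA-256 stand-in used here. The function name `partition_for` and the partition count of 8 are hypothetical.

```python
import hashlib

def partition_for(partition_key: tuple, num_partitions: int) -> int:
    """Map a partition key to a partition index by hashing it (illustrative only)."""
    # Join the key parts with a separator that is unlikely to appear in the data.
    raw = "\x1f".join(str(part) for part in partition_key).encode("utf-8")
    digest = hashlib.sha256(raw).digest()
    # Treat the first 8 bytes of the digest as an unsigned integer token.
    token = int.from_bytes(digest[:8], "big")
    return token % num_partitions

# Hypothetical partition keys in the shape (title, year).
keys = [("Inception", 2010), ("Heat", 1995), ("Alien", 1979)]
placement = {key: partition_for(key, 8) for key in keys}
for key, partition in placement.items():
    print(key, "->", partition)
```

The same key always hashes to the same partition, which is what lets a request be sent straight to the servers that own the data.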
To enable automatic scaling, the service monitors workload patterns and adjusts capacity dynamically to meet demand. This elasticity is especially important for applications with unpredictable or highly variable workloads.
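As a rough illustration of capacity following demand, the sketch below computes how much capacity keeps utilization near a target percentage. This is not the algorithm AWS uses, which the source does not specify; the function name `target_capacity`, the 70 percent target, and the minimum and maximum bounds are all assumptions made for the example.

```python
def target_capacity(consumed_units: int, target_percent: int,
                    min_units: int, max_units: int) -> int:
    """Return the capacity that keeps consumed/provisioned near target_percent."""
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be between 1 and 100")
    # Ceiling division with integers avoids floating-point rounding surprises.
    needed = -(-consumed_units * 100 // target_percent)
    # Clamp the result to the configured bounds.
    return max(min_units, min(max_units, needed))

# Hypothetical workloads: quiet, moderate, and a spike beyond the maximum.
for consumed in (50, 700, 5000):
    print(consumed, "->", target_capacity(consumed, 70, 100, 4000))
```

A quiet workload falls back to the minimum, a moderate one gets headroom above its consumption, and a spike is capped at the maximum.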
Security in Amazon Keyspaces is managed through AWS Identity and Access Management (IAM). Developers can define granular access policies to control who can read or write data. Additionally, data is encrypted both at rest and in transit, adhering to industry-standard security practices.
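A granular policy might look like the following sketch, which builds a read-only IAM policy document for a single table. The account ID, Region, keyspace, and table names are placeholders, and the helper `read_only_policy` is hypothetical. The `cassandra:Select` action and the resource ARN layout reflect the Amazon Keyspaces IAM documentation as best understood here, so check them against the current documentation before use.

```python
import json

def read_only_policy(region: str, account_id: str, keyspace: str, table: str) -> dict:
    """Build an IAM policy document that allows reads on one table."""
    prefix = f"arn:aws:cassandra:{region}:{account_id}:/keyspace"
    table_arn = f"{prefix}/{keyspace}/table/{table}"
    # Assumption: drivers also read the system keyspaces when they connect.
    system_arn = f"{prefix}/system*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["cassandra:Select"],
                "Resource": [table_arn, system_arn],
            }
        ],
    }

# Placeholder values for illustration only.
policy = read_only_policy("us-east-1", "111122223333", "catalog", "famous_films")
print(json.dumps(policy, indent=2))
```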
Amazon Keyspaces also integrates with other AWS services. For instance, it can be used alongside Amazon S3 for data storage or AWS Lambda for serverless computing.
Learn more about Data architecture principles
Amazon Keyspaces: Key use cases
Build applications that require low latency
Applications requiring low latency can benefit from Amazon Keyspaces. With its partitioned data model and high throughput, Keyspaces can handle large volumes of read and write operations with minimal delay. This makes it suitable for real-time applications like live streaming, gaming, and financial trading platforms, where performance is critical.
Move your Cassandra workloads to the cloud
Migrating existing Cassandra workloads to Amazon Keyspaces has a key advantage: Amazon Keyspaces maintains compatibility with CQL, so your existing Cassandra applications don’t require extensive rewrites.
Build Applications in AWS Using Open Source Technologies
Because Amazon Keyspaces uses the same CQL as Apache Cassandra, developers familiar with Cassandra can adopt Amazon Keyspaces without needing to learn new skills.
Additionally, the compatibility with open source frameworks means you can leverage a broad range of tools and libraries that work with Apache Cassandra.
Learn more in our detailed guide to Cassandra architecture
Quick tutorial: Getting started with Amazon Keyspaces (for Apache Cassandra)
The following tutorial is based on the official Amazon Keyspaces documentation.
Step 1: Create a Keyspace and a Table
Creating a keyspace and a table in Amazon Keyspaces is a straightforward process that involves a few steps within the AWS Management Console.
Creating a keyspace
A keyspace in Amazon Keyspaces groups related tables that are relevant for one or more applications. It defines the replication strategy for all the tables it contains.
Steps to create a keyspace:
- Sign in to the AWS Management Console and navigate to Amazon Keyspaces.
- In the navigation pane, select Keyspaces.
- In the Keyspace name box, enter catalog as the name for your keyspace.
- Under AWS Regions, confirm that Single-Region replication is the replication strategy for the keyspace.
- Click on Create keyspace.
- To verify keyspace creation, go to the navigation pane, select Keyspaces, and locate your keyspace catalog in the list.
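The console steps above have a CQL equivalent. The sketch below only builds the statement text so it can be pasted into a CQL session; the helper name `create_keyspace_cql` and its name check are hypothetical, and `SingleRegionStrategy` is assumed to be the replication class that corresponds to Single-Region replication.

```python
import re

def create_keyspace_cql(name: str) -> str:
    """Return a CREATE KEYSPACE statement for a single-Region keyspace."""
    # Simple guard (an assumption, not the service's full naming rule).
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError(f"unexpected keyspace name: {name!r}")
    return (
        f"CREATE KEYSPACE {name} "
        "WITH REPLICATION = {'class': 'SingleRegionStrategy'};"
    )

print(create_keyspace_cql("catalog"))
```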
Creating a table
A table is where your data is organized and stored. The primary key of your table determines how data is partitioned and indexed.
Steps to create a table:
- Sign in to the AWS Management Console, navigate to Amazon Keyspaces, and click Keyspaces.
- Choose the keyspace catalog where you want to create the table.
- In the Table name box, enter famous_films as the name for your table.
- In the Columns section, add the following columns and data types:
  - year (int)
  - title (text)
  - director (text)
  - producer (text)
  - length (int)
  - genre (text)
  - rating (text)
- Set the Partition Key:
  - Choose title and year as the partition keys.
- Add Clustering Columns:
  - Choose genre and director.
  - For genre, select ASC to sort in ascending order.
- In the Table settings section, choose Default settings.
- Click on Create table.
- To verify the table creation, go to the navigation pane, select Tables, and confirm that your table famous_films is listed. Select the table name and verify that all columns and data types are correct.
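For readers who prefer CQL to the console, the table definition above can be sketched as a CREATE TABLE statement. The helper `create_table_cql` is hypothetical and only assembles the statement text. The ascending order for director is an assumption, since the steps state a sort order for genre only.

```python
def create_table_cql(keyspace, table, columns, partition_key, clustering):
    """Assemble a CREATE TABLE statement from column and key definitions."""
    cols = ",\n    ".join(f"{name} {cql_type}" for name, cql_type in columns.items())
    pk = ", ".join(partition_key)
    ck_names = ", ".join(name for name, _ in clustering)
    order = ", ".join(f"{name} {direction}" for name, direction in clustering)
    return (
        f"CREATE TABLE {keyspace}.{table} (\n"
        f"    {cols},\n"
        f"    PRIMARY KEY (({pk}), {ck_names})\n"
        f") WITH CLUSTERING ORDER BY ({order});"
    )

ddl = create_table_cql(
    "catalog",
    "famous_films",
    {"year": "int", "title": "text", "director": "text", "producer": "text",
     "length": "int", "genre": "text", "rating": "text"},
    partition_key=["title", "year"],
    # ASC for director is assumed; the console steps only specify genre.
    clustering=[("genre", "ASC"), ("director", "ASC")],
)
print(ddl)
```

The double parentheses in the primary key mark title and year as a composite partition key, with genre and director as clustering columns.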
Step 2: Create and read data
Create data
To insert data into your famous_films table, use the INSERT statement. Here’s how you can add a single row:
- Open AWS CloudShell and connect to Amazon Keyspaces using the following command, replacing us-east-1 with your specific region:
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
Note: If cqlsh is not installed in your CloudShell environment, you can install it using this command:
pip install cqlsh
- Also install the required tools using the following command:
pip install --user cqlsh-expansion
- Now initialize it using the following command:
cqlsh-expansion.init
- Once connected, set the write consistency for your session to LOCAL_QUORUM by running:
CONSISTENCY LOCAL_QUORUM;
- Insert a single record into the famous_films table:
INSERT INTO catalog.famous_films (title, year, genre, director, producer, length, rating)
VALUES ('Inception', 2010, 'Sci-Fi', 'Christopher Nolan', 'Emma Thomas', 148, 'PG-13');
- Verify the data was added successfully by running:
SELECT * FROM catalog.famous_films;
The output should show the row you just inserted.
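Because the connection command differs only by region, it can be generated rather than edited by hand. The sketch below assumes the endpoint pattern shown in the command above, which may not hold for every AWS partition; the helper name `cqlsh_command` is hypothetical.

```python
import shlex

KEYSPACES_PORT = 9142  # port used in the connection command above

def cqlsh_command(region: str) -> str:
    """Build the cqlsh-expansion connection command for a region."""
    # Assumed endpoint pattern, copied from the us-east-1 example.
    endpoint = f"cassandra.{region}.amazonaws.com"
    return shlex.join(["cqlsh-expansion", endpoint, str(KEYSPACES_PORT), "--ssl"])

print(cqlsh_command("eu-west-1"))
```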
Step 3: Insert multiple records from a CSV file
To insert multiple records, follow these steps:
- Download the sample CSV file keyspaces_sample_table.csv.
- Open AWS CloudShell and connect to Amazon Keyspaces:
cqlsh-expansion cassandra.us-east-1.amazonaws.com 9142 --ssl
- Specify the keyspace:
USE catalog;
- Set the write consistency:
CONSISTENCY LOCAL_QUORUM;
- Upload the CSV file to AWS CloudShell using the upload option and note the file path.
- At the keyspace prompt, run:
COPY famous_films (title, year, genre, director, producer, length, rating)
FROM '/home/cloudshell-user/keyspaces_sample_table.csv' WITH header=TRUE;
- Verify the data by running:
SELECT * FROM famous_films;
The output should list the rows imported from the CSV file.
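If you want to load your own records instead of the sample file, a CSV in the shape that the COPY command expects can be produced with a short script. This is a sketch: the file name films_sample.csv and the helper `write_films_csv` are hypothetical, and the two rows are illustrative rather than taken from the sample file. The header row matches the column list passed to COPY with header=TRUE.

```python
import csv
import tempfile
from pathlib import Path

# Column order matches the column list in the COPY command.
COLUMNS = ["title", "year", "genre", "director", "producer", "length", "rating"]

# Illustrative rows; replace with your own data.
SAMPLE_FILMS = [
    {"title": "Inception", "year": 2010, "genre": "Sci-Fi",
     "director": "Christopher Nolan", "producer": "Emma Thomas",
     "length": 148, "rating": "PG-13"},
    {"title": "The Dark Knight", "year": 2008, "genre": "Action",
     "director": "Christopher Nolan", "producer": "Emma Thomas",
     "length": 152, "rating": "PG-13"},
]

def write_films_csv(path: Path, films: list) -> Path:
    """Write films to a CSV file with a header row and return the path."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(films)
    return path

csv_path = write_films_csv(Path(tempfile.gettempdir()) / "films_sample.csv", SAMPLE_FILMS)
print(csv_path.read_text(encoding="utf-8"))
```

After uploading the generated file to CloudShell, point the FROM clause of the COPY command at its path.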
Read data
To read data from the famous_films table, use the SELECT statement. Here are some examples:
Select all data:
SELECT * FROM catalog.famous_films;
This returns all columns and rows in the table.
Select specific columns:
SELECT title, genre, year FROM catalog.famous_films;
This returns only the title, genre, and year columns.
Select specific rows:
SELECT * FROM catalog.famous_films WHERE year=2010 AND title='Inception';
This filters the data to return only rows where the year is 2010 and the title is ‘Inception’.
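To try the three query shapes without an AWS account, the sketch below runs them against an in-memory SQLite database as a local stand-in. SQLite is not Cassandra: it has no partition keys or consistency levels, so this only shows what each query returns, not how Amazon Keyspaces executes it. The attached schema name catalog mimics the keyspace prefix, and the two rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database named "catalog" so that the
# keyspace-qualified table name from the queries above can be reused.
conn.execute("ATTACH DATABASE ':memory:' AS catalog")
conn.execute("""
    CREATE TABLE catalog.famous_films (
        title TEXT, year INTEGER, genre TEXT, director TEXT,
        producer TEXT, length INTEGER, rating TEXT,
        PRIMARY KEY (title, year, genre, director)
    )
""")
conn.executemany(
    "INSERT INTO catalog.famous_films VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("Inception", 2010, "Sci-Fi", "Christopher Nolan", "Emma Thomas", 148, "PG-13"),
        ("The Dark Knight", 2008, "Action", "Christopher Nolan", "Emma Thomas", 152, "PG-13"),
    ],
)

all_rows = conn.execute("SELECT * FROM catalog.famous_films").fetchall()
some_columns = conn.execute(
    "SELECT title, genre, year FROM catalog.famous_films"
).fetchall()
one_film = conn.execute(
    "SELECT * FROM catalog.famous_films WHERE year=2010 AND title='Inception'"
).fetchall()

print(len(all_rows), "rows in total")
print(some_columns)
print(one_film)
```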
Leveraging the power of Instaclustr for Apache Cassandra on AWS
Self-managed Apache Cassandra, even when it runs on AWS, can be complex to operate without the required expertise. Boost your Apache Cassandra operations with Instaclustr, a trusted managed service provider known for its prowess in supporting this highly scalable and distributed NoSQL database. With a strong focus on reliability, performance, and security, Instaclustr provides a range of services to ensure smooth operations and optimal utilization of Cassandra clusters.
- Effortless Deployment and Management: Instaclustr takes the complexity out of managing and deploying Cassandra clusters. We offer automated provisioning and configuration of Cassandra nodes, making database setup and scaling a breeze. We also handle node replacements, repairs, and upgrades, freeing your developers to focus more on your applications and less on infrastructure maintenance.
- Proactive Monitoring and Optimization: With Instaclustr, you get real-time tracking of the health and performance of your Cassandra clusters. Through proactive monitoring and alerting, we can identify potential issues early and take necessary actions to prevent downtime or performance degradation. Plus, our performance tuning services ensure efficient query execution and minimal latency.
- Uncompromising Security and Compliance: Protecting your data is our priority. Our robust security measures, including encryption at rest and in transit, role-based access control, and regular security audits, safeguard your Cassandra clusters. Plus, we provide the tools and guidance you need to comply with industry standards and regulations like GDPR and HIPAA.
- 24/7 Expert Support: Our experienced team of Cassandra experts is here to provide you with 24/7 technical support. You can count on us to promptly troubleshoot issues, provide best practice guidance, and assist with any Cassandra-related challenges.
- Why Choose Managed Cassandra Over Amazon Keyspaces: While Amazon Keyspaces offers a managed service for Cassandra, opting for Instaclustr’s managed Cassandra provides you with more flexibility and control. Instaclustr allows for greater customization of your clusters, which can be critical for businesses with specific performance or security requirements. Additionally, our dedicated support and tuning services ensure that your Cassandra implementation is optimized to suit your unique business needs.
Ready to experience the best in Cassandra management? Explore Instaclustr’s solutions for Apache Cassandra today.
For more information: