Managed Apache Kafka on AWS

Deploy and run Apache Kafka on AWS to create highly available clusters.

Apache Kafka: The Platform for Building Real-Time Streaming Data Pipelines and Applications

Our fully managed and hosted Apache Kafka as a Service is available on AWS and lets you build fast, scalable distributed streaming applications. You can deploy Kafka on AWS as highly available clusters with pre-configured, optimized settings. Our Kafka optimization techniques follow best practices developed through extensive testing against a range of real-world use cases.

Running Kafka on AWS (or Your Cloud of Choice)

Running on the Instaclustr Managed Platform with your chosen cloud provider is easy. You have two options: simply run Instaclustr managed Kafka from within Instaclustr’s AWS accounts, or run in your own cloud provider account. If you are interested in the latter option, get in touch with our Sales team.

Best Practices for Running Managed Kafka on AWS

The first step in deploying Kafka on AWS is choosing the right Amazon EC2 instance type for the Kafka nodes (brokers). This choice determines the performance and throughput of your cluster, as well as the cost of running it on AWS, and it usually involves a trade-off between cost and performance. Amazon EC2 offers a huge number of instance types with varied combinations of CPU, memory, and disk hardware to suit different purposes and applications. Instaclustr makes this step easier by narrowing the choice down to a handful of instance types that offer the best return on investment for a Kafka deployment. Our research and choice of instance types are based on Kafka’s architecture and internals, AWS features, a cost-vs-value analysis, and, most importantly, real-world use cases of Kafka.

Learn how to store Kafka data to Amazon S3 via the Instaclustr console

Learn how to create a Kafka Cluster on the Instaclustr console

Kafka and Cassandra in Action: Explore How We Built a Massively Scalable Anomaly Detection Application

This 10-part blog series showcases a detailed anomaly detection application we deployed on Amazon EKS and integrated with a massive-scale Apache Kafka and Cassandra data pipeline on AWS, all through the Instaclustr Managed Platform. The series highlights best practices, performance tuning, and monitoring and tracing capabilities. Above all, it demonstrates how a massively scalable Kafka-Cassandra data pipeline can be architected to process billions of daily transactions and detect anomalies in them.

Explore the 10-part series by category.

In this blog we introduce the main motivation behind the project, and cover functionality and initial test results beginning with Cassandra.

Learn how to provision Cassandra and Kafka clusters automatically with Instaclustr’s provisioning API.

In this post, we generate a high-volume load for Kafka, the log aggregation system that operates via a publish-subscribe mechanism.

Metrics were added to compute and report CPU utilization, memory usage, event production rate, and producer latency.

We explore how to better understand an open source system using Prometheus for distributed metrics monitoring.

In this post, we look at another way of increasing visibility into a system using OpenTracing for distributed tracing.

We explore deploying the Anomalia Machina application on Kubernetes with the help of Amazon EKS.

We deploy the instrumented application in a cloud production environment.

We test out the application to see how anomaly detection can scale on small Kafka and Cassandra Instaclustr production clusters.

Our final blog of the Anomalia Machina series focuses on scaling the application out from 3 to 48 Cassandra nodes. The scale results were impressive: 574 CPU cores (across the Cassandra, Kafka, and Kubernetes clusters), a peak of 2.3 million writes/s into Kafka, and a sustainable 220,000 anomaly checks per second. In total, the application handled a massive 19 billion anomaly checks per day.
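
As a quick sanity check, the daily total follows directly from the sustained per-second rate quoted above. The short Python sketch below only reproduces that arithmetic; the variable names are illustrative, and it assumes the sustainable rate holds for a full 24 hours.

```python
# Figure quoted in the text: sustainable anomaly checks per second.
CHECKS_PER_SECOND = 220_000

# Assumption: the sustainable rate is held for a full 24-hour day.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

checks_per_day = CHECKS_PER_SECOND * SECONDS_PER_DAY

print(f"{checks_per_day:,} checks per day")            # 19,008,000,000 checks per day
print(f"{checks_per_day / 1e9:.1f} billion per day")   # 19.0 billion per day
```

The result, just over 19 billion, matches the rounded daily figure stated in the series.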

Apache Kafka Benchmarks for AWS

We recently completed an extensive benchmarking exercise to help our customers evaluate instance types for Kafka.