What is Apache Kafka?

Apache Kafka is a distributed event streaming platform built for high-throughput, fault-tolerant, and scalable workloads. It enables real-time data processing by allowing applications to publish, subscribe to, store, and process streams of records in a distributed manner.

Key components of Kafka include:

  • Producers send data (events) to Kafka topics.
  • Brokers store and manage data streams.
  • Consumers read and process data from topics.
  • Apache ZooKeeper™ manages cluster coordination and leader election; however, newer Kafka releases replace ZooKeeper with KRaft (Kafka Raft) mode, and ZooKeeper support is removed entirely in Kafka 4.0.

Kafka is commonly deployed in containers. Docker, a popular containerization platform, packages Kafka together with all the dependencies it needs to run, making it easy to deploy in diverse environments.

What is Docker?

Docker is an open source platform that automates application deployment using containerization. It packages applications and their dependencies into lightweight, portable containers that run consistently across different environments.

Key components of Docker include:

  • Docker Engine: The runtime that builds and runs containers.
  • Docker Images: Pre-configured templates containing the application and dependencies.
  • Docker Containers: Running instances of Docker images.
  • Docker Compose: A tool for defining and managing multi-container applications.

Docker ensures applications run the same way in development, testing, and production environments. It is widely used for microservices, DevOps workflows, and cloud-native development.

Getting started with Apache Kafka on Docker

Install Docker

To install Docker on Ubuntu, follow these steps. The commands shown follow Docker’s official installation guide; exact package names can vary slightly between Ubuntu releases.

  1. Update the package index

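    sudo apt-get update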

  2. Install required dependencies

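    sudo apt-get install -y ca-certificates curl gnupg lsb-release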

  3. Add Docker’s official GPG key

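    sudo install -m 0755 -d /etc/apt/keyrings
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
      sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
    sudo chmod a+r /etc/apt/keyrings/docker.gpg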

  4. Set up the repository

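    echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | \
      sudo tee /etc/apt/sources.list.d/docker.list > /dev/null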

  5. Update the package index again

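    sudo apt-get update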

  6. Install Docker Engine, CLI, and containerd

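    sudo apt-get install -y docker-ce docker-ce-cli containerd.io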

  7. Verify the installation: Run the following command to check if Docker is installed correctly:
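    docker --version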

    It should return an output like:

    Docker version N.N.N, build xxxxxxx

By default, Docker is installed at /usr/bin/docker, and the service should start automatically. If needed, you can start it manually using:
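sudo systemctl start docker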

To enable it to start on boot:

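sudo systemctl enable docker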

Installing Docker Compose

To install Docker Compose, you need to set up the appropriate repository and install the package based on your operating system.

  1. Install on Ubuntu and Debian
    1. First, update the package index:
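      sudo apt-get update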
    2. Then, install the Docker Compose plugin:

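      sudo apt-get install -y docker-compose-plugin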

  2. Install on RPM-based distributions (CentOS, Fedora, RHEL, SLES)
    1. First, update the system packages:
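      sudo yum update -y   # use dnf on newer Fedora/RHEL releases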
    2. Then, install Docker Compose:
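      sudo yum install -y docker-compose-plugin   # requires Docker's yum repository to be configured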
  3. Verify installation
    1. After installation, check that Docker Compose is installed correctly by running:
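      docker compose version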

    2. This command should return an output similar to:

      Docker Compose version vN.N.N

      Replace vN.N.N with the installed version number.

Run Kafka with Docker

To set up Kafka using Docker, you need a docker-compose.yml file that defines the necessary services, including ZooKeeper and a Kafka broker. Instead of writing the configuration manually, you can use a pre-configured setup such as the community-maintained kafka-stack-docker-compose project, which is assumed in the steps below because it provides the zk-single-kafka-single.yml file referenced later:

  1. Clone the Kafka Docker project from GitHub:

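    # assumed repository; it provides the zk-single-kafka-single.yml file used below
    git clone https://github.com/conduktor/kafka-stack-docker-compose.git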

  2. Navigate into the cloned directory and list the available files:

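    cd kafka-stack-docker-compose
    ls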
    You’ll find multiple configuration files, including zk-single-kafka-single.yml, which sets up a single ZooKeeper instance and one Kafka broker.

  3. Start the Kafka cluster using Docker Compose:

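    docker compose -f zk-single-kafka-single.yml up -d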
    The -d flag runs the containers in the background.

  4. Verify that the services are running:

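    docker compose -f zk-single-kafka-single.yml ps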
    You should see both Kafka and ZooKeeper in a running state. Kafka will be accessible at localhost:9092.

Running commands

Once Kafka is running, you can interact with it using Kafka commands. There are two ways to execute commands:

From inside the Kafka container:

First, open a shell in the broker container; once inside, you can run Kafka commands directly:

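docker compose -f zk-single-kafka-single.yml exec kafka1 /bin/bash
# once inside (the kafka1 service name assumes the zk-single-kafka-single.yml setup):
kafka-topics --bootstrap-server localhost:9092 --list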

From the host machine:
To run Kafka commands outside the container, install Java and Kafka binaries on your system:
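# install Java, then download and unpack the Kafka binaries; replace <version>
# with the release you want (the path below is Apache's standard download
# layout for the Scala 2.13 build)
sudo apt-get install -y default-jre
wget https://downloads.apache.org/kafka/<version>/kafka_2.13-<version>.tgz
tar -xzf kafka_2.13-<version>.tgz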

Once installed, you can use Kafka commands such as:
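# run from inside the unpacked Kafka directory
bin/kafka-topics.sh --create --topic test-topic --bootstrap-server localhost:9092
bin/kafka-console-producer.sh --topic test-topic --bootstrap-server localhost:9092
bin/kafka-console-consumer.sh --topic test-topic --from-beginning --bootstrap-server localhost:9092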

Stopping Kafka on Docker

To stop the running Kafka services, use:

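docker compose -f zk-single-kafka-single.yml stop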
To completely remove all containers and resources, use:

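docker compose -f zk-single-kafka-single.yml down -v
# -v also removes the associated volumes and their data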

Related content: Read our Apache Kafka tutorial.


5 Best practices for running Kafka with Docker

Here are some useful practices to run Kafka efficiently and reliably in a Docker environment.

1. Utilize official Docker images

Always use official Docker images for Kafka and ZooKeeper, such as Apache’s apache/kafka image. These images are well-maintained, regularly updated, and tested for compatibility with the latest Kafka versions. Using third-party images may introduce security risks, compatibility issues, or lack of proper support.

For Apache’s official image:
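# apache/kafka is published on Docker Hub for Kafka 3.7 and later
docker pull apache/kafka:latest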

Using verified images ensures stability and simplifies maintenance.

2. Configure listeners and advertised listeners

Kafka clients need to know how to connect to the broker, both from within Docker and externally. Improper listener configuration can cause connection issues, especially when running Kafka in a containerized environment.

Define KAFKA_LISTENERS and KAFKA_ADVERTISED_LISTENERS properly in your docker-compose.yml file:
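# a minimal sketch using Confluent-style image variables; kafka1 is an example
# service name. INTERNAL serves other containers on the Docker network,
# EXTERNAL serves clients connecting from the host:
environment:
  KAFKA_LISTENERS: INTERNAL://0.0.0.0:19092,EXTERNAL://0.0.0.0:9092
  KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka1:19092,EXTERNAL://localhost:9092
  KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
  KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL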

For multi-container deployments, use proper DNS resolution to ensure the advertised listeners reflect the correct hostnames accessible by clients.

3. Implement data persistence

By default, Kafka stores data in /var/lib/kafka/data inside the container. If containers are restarted without persistent storage, all messages will be lost. To ensure data durability, mount a volume:
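# under the Kafka service definition (kafka_data is an example volume name):
volumes:
  - kafka_data:/var/lib/kafka/data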

Define this volume in docker-compose.yml:
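# at the top level of docker-compose.yml:
volumes:
  kafka_data: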

This setup prevents data loss in case of container restarts or upgrades.

4. Secure network configurations

Kafka should not expose ports unnecessarily. Use Docker’s network configuration to restrict access:

Place Kafka and ZooKeeper in a dedicated Docker network:
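# attach both services to the dedicated network (kafka_net is an example name):
services:
  kafka:
    networks:
      - kafka_net
  zookeeper:
    networks:
      - kafka_net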

Define the network in docker-compose.yml:
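# at the top level of docker-compose.yml:
networks:
  kafka_net:
    driver: bridge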

Use authentication and encryption (SASL, SSL) for securing Kafka traffic in production.

5. Monitor resource allocation

Kafka can be resource-intensive. Running it in Docker requires monitoring CPU, memory, and disk usage to prevent performance degradation. Use Docker’s built-in monitoring tools:

Check container resource usage:
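docker stats   # live CPU, memory, network, and block I/O per container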

Use Prometheus and Grafana for advanced monitoring:
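# sketch: expose JMX so a JMX exporter (e.g., prometheus/jmx_exporter) can
# publish broker metrics for Prometheus to scrape and Grafana to visualize;
# variable names follow the Confluent images, port and hostname are examples
environment:
  KAFKA_JMX_PORT: 9999
  KAFKA_JMX_HOSTNAME: kafka1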

Adjust JVM heap size (KAFKA_HEAP_OPTS) in docker-compose.yml to optimize performance:
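# example: a fixed 1 GB heap; size it to your workload and container memory limits
environment:
  KAFKA_HEAP_OPTS: "-Xms1g -Xmx1g"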


Maximize Kafka’s power with Instaclustr

Apache Kafka is a phenomenal tool for building real-time data pipelines and streaming applications, but effectively managing and scaling Kafka can be a daunting challenge. That’s where Instaclustr for Apache Kafka steps in, making your Kafka experience seamless, efficient, and worry-free.

Instaclustr offers a fully-managed Kafka service, handling everything from deployment to ongoing performance optimization. By taking care of Kafka’s complexity, it lets businesses focus on what truly matters—extracting value from their data. This includes automated monitoring, patches, and updates, saving your team time and reducing operational burdens. The result? A robust streaming platform you can depend on without the headaches of managing it in-house.

Built for reliability and scalability

With Instaclustr for Apache Kafka, you get an environment built for reliability and high availability. Their managed service runs on fault-tolerant infrastructure with built-in replication, ensuring that your data streams are always available, even during unexpected disruptions. And when your business grows, scaling your Kafka cluster becomes effortless, with Instaclustr’s expert team offering guidance to optimize performance.

Open source freedom with enterprise-grade security

Instaclustr is committed to open-source technology, giving you vendor-neutral flexibility and avoiding lock-in. At the same time, security is never compromised. Instaclustr for Kafka is equipped with enterprise-grade security features, including encryption, role-based access controls, and compliance with industry standards.

Switching to Instaclustr for Apache Kafka means more than just outsourcing management; it’s about empowering your team with a reliable, scalable, and efficient streaming solution. Simplify your Kafka operations, and take your data-driven initiatives to the next level with a trusted partner by your side.

For more information: