Whether you’re integrating multiple microservices, looking to improve app reliability, or building a new streaming app, you might need a message queue (MQ) or message broker platform. These types of software pass messages from producing apps or services to consuming apps or services.
Two of the most popular platforms for handling messages are Apache Kafka and RabbitMQ. At a high level, they have similar functions, though there are important differences between them. Understanding those differences can help you choose one or the other for your particular use case.
Table of Contents
- Asynchronous Messaging Patterns
- What Is RabbitMQ
- What Is Apache Kafka
- Kafka and RabbitMQ Messaging Patterns
- Security and Operations
- Apache Kafka Use Cases
- RabbitMQ Use Cases
Asynchronous Messaging Patterns
Both Kafka and RabbitMQ use asynchronous messaging to pass information from producing apps to consuming apps. The messaging is considered asynchronous because producing and consuming apps do not need to be active at the same time. The producer can deliver a message, and if the consumer is not currently available or able to receive it, the message is stored until the consumer is ready. This approach to messaging is similar to asynchronous email or texting rather than synchronous phone calls or video conferencing: with Kafka and RabbitMQ, messaging does not have to occur in real time.
There are two primary asynchronous messaging patterns: message queues and publish/subscribe patterns.
With the message queue pattern, a producing app delivers messages to a queue. When the consuming app is ready to receive messages, it connects to the queue and retrieves the messages, removing them from the queue. You might have multiple consuming apps, but each message is consumed by only one consumer.
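The message queue pattern can be sketched with a small in-memory model. This is only an illustration: the `MessageQueue` class and the order IDs are hypothetical names made up for this example, not part of Kafka or RabbitMQ.

```python
from collections import deque


class MessageQueue:
    """Toy in-memory queue: each message goes to exactly one consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        # The producer appends to the tail; no consumer has to be connected.
        self._messages.append(message)

    def receive(self):
        # A consumer takes from the head, which removes the message.
        return self._messages.popleft() if self._messages else None


queue = MessageQueue()
for order_id in ("order-1", "order-2", "order-3"):
    queue.send(order_id)

# Two competing consumers drain the queue; no message is delivered twice.
consumer_a = [queue.receive(), queue.receive()]
consumer_b = [queue.receive()]
leftover = queue.receive()  # None: the queue is now empty
```

Because receiving removes the message, adding consumers spreads the work rather than duplicating it.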
With the publish/subscribe (pub/sub) messaging pattern, producers publish messages, and multiple consumers are able to consume each message. When consuming apps are interested in a particular producer’s messages, they subscribe to a channel where that producer will send its messages.
This pattern is typically used when you need a message or event to trigger multiple actions. Each subscriber receives its own copy of every message. With competing consumers on a message queue, messages can finish processing out of order, whereas pub/sub systems generally deliver messages to each subscriber in the order in which the messaging system received them.
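For contrast with the message queue pattern, here is a minimal pub/sub sketch. The `Channel` class and the subscriber names are hypothetical and exist only for this illustration; a real broker would add persistence, networking, and acknowledgments.

```python
class Channel:
    """Toy in-memory pub/sub channel: every subscriber gets every message."""

    def __init__(self):
        self._inboxes = {}

    def subscribe(self, subscriber_name):
        # Each subscriber gets its own inbox, so reads never compete.
        inbox = []
        self._inboxes[subscriber_name] = inbox
        return inbox

    def publish(self, message):
        # A copy of the message is delivered to every current subscriber.
        for inbox in self._inboxes.values():
            inbox.append(message)


orders = Channel()
billing_inbox = orders.subscribe("billing")
shipping_inbox = orders.subscribe("shipping")

orders.publish("order-created")
orders.publish("order-paid")
```

Both inboxes end up holding both messages in publish order, which is what lets one event trigger several independent actions.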
What Is RabbitMQ
RabbitMQ is an open source distributed message broker. It is often labeled as a “mature” platform (it was first released in 2007) and grouped with “traditional” messaging middleware platforms, such as IBM MQ and Microsoft Message Queuing (MSMQ).
Developers often choose RabbitMQ for its flexibility. It can handle complex routing scenarios, and it supports multiple messaging protocols, including AMQP, MQTT, and STOMP. It can be deployed in distributed configurations for scaling and delivering high availability.
RabbitMQ has a large community. Developers can easily find clients, plug-ins, and guides, and they can opt for commercial support through Pivotal (which was acquired by VMware). RabbitMQ also has a large number of high-profile enterprise users, including Reddit, Robinhood, T-Mobile, trivago, Accenture, Alibaba Travel, and more.
The RabbitMQ architecture includes producers, exchanges, queues, and consumers. A producer pushes messages to an exchange, which then routes messages to queues (or other exchanges). A consumer then reads messages from the queue, often up to a predetermined limit of unacknowledged messages (the prefetch count).
A RabbitMQ queue is a sequential data structure. Producers add data to the tail of the queue; consumers receive data from the head of the queue. The queues are “first in, first out” with RabbitMQ: the first message in the queue is consumed first. Queues have some mandatory properties (such as a name) and some optional properties (such as arguments used by plug-ins).
RabbitMQ message exchanges—which determine how messages are routed—provide a great deal of flexibility. With RabbitMQ, producers send messages to one of four exchange types:
- Direct exchanges route a message to the queues whose binding key exactly matches the routing key that the message carries. The routing key is a string that has some relevance to the message, often a list of words separated by periods.
- Fanout exchanges route messages to every queue bound to the exchange. In this broadcasting type of exchange, the routing key is ignored.
- Topic exchanges route messages to one or more queues by matching the routing key against a binding pattern. The pattern can contain wildcards: an asterisk (*) stands for exactly one word, and a hash (#) stands for zero or more words.
- Headers exchanges route messages based on the message headers, which can contain more attributes than a routing key.
These exchange types enable RabbitMQ to handle complex routing scenarios with multiple consuming apps or services.
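The four routing rules can be sketched in plain Python. This is a simplified model, not RabbitMQ client code: the `route` and `topic_matches` functions, the queue names, and the binding layout are all hypothetical, and the headers case assumes that every bound header must match.

```python
def topic_matches(pattern, routing_key):
    """Match a dotted routing key against a topic binding pattern."""

    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' may absorb zero or more words.
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        # '*' stands for exactly one word; anything else must be equal.
        return (p[0] == "*" or p[0] == k[0]) and match(p[1:], k[1:])

    return match(pattern.split("."), routing_key.split("."))


def route(exchange_type, bindings, routing_key="", headers=None):
    """Return the queue names that would receive the message.

    bindings is a list of (queue_name, binding) pairs. The binding is a
    key or pattern string, or a dict of headers for a headers exchange.
    """
    headers = headers or {}
    matched = []
    for queue_name, binding in bindings:
        if exchange_type == "fanout":
            ok = True  # routing key is ignored
        elif exchange_type == "direct":
            ok = binding == routing_key
        elif exchange_type == "topic":
            ok = topic_matches(binding, routing_key)
        elif exchange_type == "headers":
            # Assumption: all bound headers must match.
            ok = all(headers.get(n) == v for n, v in binding.items())
        else:
            raise ValueError(f"unknown exchange type: {exchange_type}")
        if ok:
            matched.append(queue_name)
    return matched


bindings = [
    ("billing", "orders.created"),
    ("audit", "orders.#"),
    ("eu", "orders.eu.*"),
]
direct_result = route("direct", bindings, "orders.created")
fanout_result = route("fanout", bindings, "anything")
topic_result = route("topic", bindings, "orders.eu.created")

header_bindings = [
    ("pdf", {"format": "pdf"}),
    ("pdf-eu", {"format": "pdf", "region": "eu"}),
]
headers_result = route(
    "headers", header_bindings, headers={"format": "pdf", "region": "us"}
)
```

Running the same bindings through different exchange types shows how the exchange, rather than the producer, decides which queues see a message.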
What Is Apache Kafka
Apache Kafka is an open source distributed event-streaming platform. Originally developed by LinkedIn to track website activity, Kafka today is generally employed for building real-time data pipelines and streaming apps. Often considered the leading streaming and queuing technology for large-scale, always-on, and event-driven apps, Kafka is regularly among the top five most active projects of the Apache Software Foundation.
Developers choose Kafka for several reasons:
- Scalability: Kafka’s distributed architecture enables significant horizontal scalability.
- Performance: Kafka is fast! It can process millions of messages per second with relatively modest resources.
- Flexibility: Designed to interface with a variety of systems, Kafka has useful, intuitive APIs.
- Availability: Kafka delivers high availability through load balancing and data replication.
- Community: As part of the Apache Software Foundation, Kafka has a rich ecosystem and community.
- Strong reputation: Kafka is used by leading, high-profile organizations, including not only LinkedIn but also Netflix, Twitter, Spotify, Pinterest, Airbnb, Uber, and many others.
The Kafka architecture comprises producers, consumers, clusters, brokers, topics, and partitions. Producers send records to a cluster, which stores those records and serves them to consumers when the consumers request them. Each server node in the cluster is a “broker,” which stores the data provided by the producer for a configurable retention period, whether or not a consumer has already read it.
Instead of “queues,” Kafka uses “topics.” A topic is a stream of individual records; the introduction to Kafka compares a topic to a folder in a filesystem, with the records as the files in that folder. Each topic is split into partitions, which are ordered, immutable sequences of records to which new records are appended. Each record in a partition has a sequential ID called an “offset,” which marks its position. A producer appends records to a topic partition, and a consumer subscribes to the topic and reads new records as they arrive.
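A partition behaves like an append-only log, which a short sketch can make concrete. The `Partition` class and its methods are hypothetical names for this illustration and do not correspond to any Kafka client API.

```python
class Partition:
    """Toy append-only log: records are never changed or removed by reads."""

    def __init__(self):
        self._records = []

    def append(self, value):
        # The offset is simply the record's position in the log.
        offset = len(self._records)
        self._records.append(value)
        return offset

    def read(self, offset, max_records=10):
        # Return (offset, value) pairs starting at the requested offset.
        batch = self._records[offset:offset + max_records]
        return list(enumerate(batch, start=offset))


partition = Partition()
offsets = [partition.append(v) for v in ("signup", "login", "purchase")]

# Each consumer tracks its own position, so reads are independent.
late_consumer = partition.read(offset=1)
replaying_consumer = partition.read(offset=0)
```

Unlike the queue model, reading does not remove anything: a consumer can resume from a saved offset or replay the log from the start.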
Kafka can spread messages across partitions. You might decide to place those partitions on multiple brokers so that multiple consumers can read from a topic in parallel while also enabling a topic to hold more data than could fit on any one machine. Alternatively, producers can create logical message streams, for example by giving related records the same key so that they land in the same partition, which helps ensure that consumers receive those messages in the right order.
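One common way to keep a logical stream in order is to send every record with the same key to the same partition. The sketch below assumes a simple hash-modulo scheme; the `choose_partition` function, the CRC32 hash, and the sample events are choices made for this illustration, not a description of how Kafka clients hash keys.

```python
import zlib


def choose_partition(key, num_partitions):
    # A stable hash means the same key always maps to the same partition.
    # CRC32 is an arbitrary choice for this sketch.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 3
partitions = [[] for _ in range(NUM_PARTITIONS)]

events = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "view"),
    ("user-2", "logout"),
    ("user-1", "logout"),
]
for key, value in events:
    partitions[choose_partition(key, NUM_PARTITIONS)].append((key, value))

# All of user-1's events sit in one partition, in the order they were sent.
user_1_partition = partitions[choose_partition("user-1", NUM_PARTITIONS)]
user_1_events = [value for key, value in user_1_partition if key == "user-1"]
```

Ordering holds within a partition but not across partitions, so the key should identify the stream whose order matters.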