Apache Kafka has the default ability to allow a topic to be created on a broker when a message is written to it and when a topic with the name the message is attempting to be written to does not exist. This can be very helpful in early development or prototyping where code, topic names, and schemas are in flux. However, past that early stage, it is recommended that Kafka be configured to disable the auto-creation of topics from messages for a few reasons. In this article I am going to touch on two of these reasons that are also core principles of Kafka partitions and Kafka topic replication.
Topics get created by default with a single partition and a single replication. A topic with a single partition is essentially a bottleneck for throughput. If you are using Apache Kafka, odds are you are using it for distributed event streaming. Only having a single partition really doesn’t capitalize on the “distributed” part. This can lead to lower throughput because real-time processing solutions, such as Apache Flink, can have multiple consumers to read from a Kafka topic. Having them all have to wait for one partition can lead to timeouts, depending on timeout configuration and number of consumers trying to read from the topic, because that one partition simply cannot keep up with the number of requests coming in.
The recommended approach is not just to have multiple partitions, but also to properly test and assess what your data throughput needs are and to have a number of partitions that align with those goals. Not just on their own for Kafka, but also in conjunction with other aspects of the data pipeline.
Let’s explore an example scenario: say you have events being written to a Kafka topic (let’s call it input-topic) from a REST endpoint. They need to be processed and transformed in real time by a distributed processing engine like Apache Flink, then output to another Kafka topic (let’s call it output-topic), and then persisted to a database such as Apache Cassandra.
The place where most of the computational load is going to occur is in the Flink processor since it has to do everything from data transformations to maintaining and updating state (in more complex situations), so it is going to be the main driver behind the partitions of input-topic. This is because if the ideal throughput can’t be reached with a single parallelism (think “thread” — parallelism of 5 == 5 threads, basically), then the Flink job needs to scale out horizontally to achieve this. So the Flink job increases its parallelism to 10.
If the Kafka topic only has one partition, then the Kafka topic becomes the bottleneck, like I mentioned earlier. So it has to scale equally, to 10 partitions (it should be noted that it doesn’t necessarily have to be exactly 1 to 1 partitions for parallelism, be it Flink or another distributed processing engine, but it is best practice). If this gives the desired throughput with no persistent backpressure, then you found the ideal partition configuration.
The output-topic can be configured in the same way. It doesn’t have to match the same output partition numbers like the input-topic is recommended to, as long as there is no backpressure. What you need to consider for the output-topic partitions is what is needed for the records to be written to Cassandra efficiently.
The second item to consider is the replication factor. To borrow from Kafka’s own documentation to explain it:
“Kafka replicates the log for each topic’s partitions across a configurable number of servers (you can set this replication factor on a topic-by-topic basis). This allows automatic failover to these replicas when a server in the cluster fails so messages remain available in the presence of failures.”
The recommended default value is 3, and in my experience that is usually fine.
What has to be considered when increasing this value is the replication requirements that were configured. Replications can have the value for “min.insync.replicas” set to “all”, which requires that when a message (with the configuration value of aks=all) is written to Kafka, if it isn’t replicated on all of those replicas (3 in this case), then the message will be considered failed and an error will be returned.
This is good for making sure that if a Kafka node dies that no data is lost. However if you increase replicas and the volume of data is very large, it is possible that this will cause throughput to drop because messages won’t be sent to be processed downstream until all replicas are in sync.
These configurations can be changed depending on your needs.
Both partitions and the replication factor need to be taken into consideration when setting up a Kafka topic, but these considerations can too easily be overlooked. Make sure you understand best practices like these to ensure you get a stable and performant cluster.
To get started with Apache Kafka, sign up for a free trial of our Managed Kafka service.