I recently wrote a Visual Guide to Apache Kafka Diskless Topics, which introduces the main ideas behind Kafka Diskless Topics and links to the relevant Kafka KIPs. At present, the only accepted KIP is the high-level KIP-1150, and the only available implementation is Aiven’s “Inkless” fork.

So, what are some potential Kafka Diskless Topic use cases, based on an early understanding of the provisional architecture and limited published and internal benchmarking? Aiven’s public benchmark results are available here, along with some observations I’ve shared here. These are early results based on provisional architectures, so actual performance characteristics may change as Diskless Topics mature and official Apache Kafka implementations evolve. Based on the design goals and proposed architecture, however, higher end-to-end latency and greater variation are expected to be the norm for Kafka Diskless.

These conclusions draw on Aiven’s benchmark results, some early internal benchmarking we have performed, and, above all, qualitative analysis of the available data (so the general observations should continue to hold as the KIP evolves). The following workloads may therefore prove to be a good fit:

  1. Tolerant of higher and less predictable end‑to‑end latency
    Seconds rather than milliseconds, with maxima of up to ~8 seconds. This includes producer latency, record batching, metadata operations (for both writes and reads), and remote storage I/O.
  2. Tolerant of higher producer latency
    Seconds (up to ~2.5s), rather than milliseconds, i.e. “slow producers”, and therefore the ability—or requirement—to scale producer concurrency.
  3. Tolerant of higher consumer latency
    Hundreds of milliseconds, i.e. “slow consumers”.
  4. Support for many partitions and topics
  5. Elastic or spiky producer workloads

These observations assume a desire to reduce inter‑AZ networking costs (where your cloud provider charges for them) and a Kafka cluster deployed across, for example, three availability zones, with producers and consumers in each AZ.

Let’s explore these potential diskless workloads in more detail.

When to use Kafka Diskless Topics

Increased end‑to‑end latency

One of the main trade‑offs with Diskless Kafka, compared to classic Kafka with broker‑based replication, is significantly higher end‑to‑end latency—from record production to consumption—measured in seconds rather than milliseconds.

As a result, suitable workloads must tolerate higher and less predictable latency. Hard or soft real-time use cases, where near‑immediate action is required, are therefore a poor fit. Better matches include use cases where processing can safely occur eventually, minutes or more after data generation, or where deferred processing is normal (for example, replay, or pipelines into downstream systems such as Iceberg).

This is still streaming data, and standard Kafka ordering and delivery guarantees continue to apply—but at much higher latencies.

Some concrete examples of this use case are telemetry ingestion for long-term analytics, and audit, compliance, and security logging.
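
One practical way to check whether a workload tolerates this latency profile is to measure produce-to-consume lag directly from record timestamps. Here is a minimal sketch using the standard Java client; the broker address, topic name, and group id are illustrative:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class EndToEndLagProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative
        props.put("group.id", "e2e-lag-probe");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("telemetry")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Assumes the default CreateTime timestamps; with LogAppendTime
                    // this measures append-to-consume lag instead.
                    long lagMs = System.currentTimeMillis() - record.timestamp();
                    System.out.printf("partition=%d offset=%d e2e-lag=%dms%n",
                            record.partition(), record.offset(), lagMs);
                }
            }
        }
    }
}
```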

Slow producers

The diskless architecture fundamentally changes the write path. Writes to any partition can be handled by any broker, eliminating inter‑AZ producer traffic and improving scalability. However, durability now requires record batching, batch coordination, and writes to remote storage.

Assuming acks=1, producers receive acknowledgements only after the remote storage write succeeds. As a result, producer write latency increases substantially—seconds, up to ~2.5s, rather than low milliseconds.

This introduces a new normal: slow producers are expected, not an exceptional condition caused by broker overload.

To achieve throughput comparable to, or even exceeding, classic Kafka, producers must increase concurrency—more producers or more producer threads. Diskless Topics therefore suit workloads that already have high producer fan-out (for example, IoT or large fleets of cloud services) or can easily scale producer instances.
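
As a sketch of what leaning into slow acknowledgements can look like with the standard Java producer: larger client-side batches and several in-flight requests keep each producer busy while acks are outstanding, and more producer instances multiply throughput from there. The values below are illustrative assumptions, not recommendations:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DisklessTunedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // illustrative
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");

        // Larger, longer-lived client batches amortise the slow remote-storage ack path.
        props.put(ProducerConfig.LINGER_MS_CONFIG, 100);
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 1 << 20);       // 1 MiB
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 256L << 20); // 256 MiB

        // Keep several requests in flight so ~seconds acks don't stall throughput,
        // and allow for the longer delivery time before the client gives up.
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 300_000);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Send asynchronously; never block per record at these latencies.
            producer.send(new ProducerRecord<>("telemetry", "sensor-1", "payload"),
                    (metadata, exception) -> {
                        if (exception != null) exception.printStackTrace();
                    });
        }
    }
}
```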

Concrete examples of use cases for slow producers are edge IoT gateways or sensors, mobile or embedded systems, and microservices emitting asynchronous events.

Slow consumers

Slow consumers are a well-known pattern in Kafka, typically caused by expensive record processing or insufficient consumer parallelism. In Diskless Kafka, however, slow consumption is primarily due to higher read latency from remote storage (assuming records are not present in broker caches).

Consumer read latency may be hundreds of milliseconds, requiring more consumers—and therefore more partitions—to maintain overall throughput.

As with slow producers, suitable use cases must be able to scale consumer concurrency and provide sufficient compute resources.
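
A minimal sketch of scaling consumer concurrency with the standard Java client follows: one consumer per thread in a shared group (the topic must have at least as many partitions as consumers for this to help). Topic, group, and fetch values are illustrative:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ScaledConsumerGroup {
    public static void main(String[] args) {
        int consumerCount = 8; // should not exceed the partition count
        ExecutorService pool = Executors.newFixedThreadPool(consumerCount);
        for (int i = 0; i < consumerCount; i++) {
            pool.submit(() -> {
                Properties props = new Properties();
                props.put("bootstrap.servers", "localhost:9092"); // illustrative
                props.put("group.id", "etl-pipeline"); // hypothetical group
                props.put("key.deserializer",
                        "org.apache.kafka.common.serialization.ByteArrayDeserializer");
                props.put("value.deserializer",
                        "org.apache.kafka.common.serialization.ByteArrayDeserializer");
                // Waiting slightly longer per fetch trades latency for larger reads.
                props.put("fetch.min.bytes", 1 << 20);
                props.put("fetch.max.wait.ms", 500);
                try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                    consumer.subscribe(List.of("events")); // hypothetical topic
                    while (true) {
                        consumer.poll(Duration.ofSeconds(1)).forEach(record -> {
                            // process(record): the expensive work happens here
                        });
                    }
                }
            });
        }
    }
}
```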

Concrete slow consumer examples are Kafka as a buffer, and enrichment or ETL pipelines.

Many partitions and topics

Classic Kafka achieves durability through replication, typically using a replication factor of three. Records are written to a leader and replicated to two followers, often across availability zones. Replication overhead increases with both partition count and throughput.

Diskless Kafka radically changes this model. There are no leaders or followers in the data path. Any broker can accept writes to any partition, and records are stored durably only in remote storage. Local disk is used solely for ephemeral caching.

This removes the substantial replication overhead present in classic Kafka (e.g. see this blog, updated here), making it possible to support substantially higher partition counts without overloading brokers. In practice, higher partition counts may be required to achieve high throughput due to batching behaviour.

While many topics and partitions are often considered an anti‑pattern in classic Kafka, there are use cases where they are beneficial, and Diskless Topics may be a good fit.

Kafka ordering is guaranteed per partition, and consumers can be configured to read from individual partitions. More partitions therefore enable fine‑grained ordering domains and selective replay. Typical examples include per‑customer, per‑sensor, per‑account, or per‑session partitioning.
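
For example, selective replay of a single ordering domain can be done by assigning a consumer directly to one partition and seeking to the desired offset. The topic and the customer-to-partition mapping below are hypothetical:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class PerCustomerReplay {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Suppose partition 42 is the ordering domain for one customer.
            TopicPartition tp = new TopicPartition("orders", 42);
            consumer.assign(List.of(tp));          // bypass the group protocol
            consumer.seekToBeginning(List.of(tp)); // replay just this domain
            while (true) {
                consumer.poll(Duration.ofSeconds(1)).forEach(r ->
                        System.out.printf("offset=%d key=%s%n", r.offset(), r.key()));
            }
        }
    }
}
```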

Higher partition counts also help isolate failures (such as poison pill records) and mitigate skewed keys that would otherwise create hot partitions.

More topics, resulting in more partitions, can also support the following (see the sketch after this list):

  • Finer‑grained ownership and access control (ACLs are applied at the topic level)
  • Multiple retention policies (retention is configured per topic)
  • Multiple schemas (schemas are typically per topic)
  • Kafka Streams applications, which create multiple internal and state‑store backing topics
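
To make the per-topic points above concrete, here is a hedged sketch of creating such a topic with the Java AdminClient. The diskless.enable flag is my reading of the opt-in config discussed around KIP-1150 and may not match the final upstream name; the partition count, retention, and replication factor are illustrative:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateDisklessTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative

        try (AdminClient admin = AdminClient.create(props)) {
            // RF is shown for API completeness; diskless durability comes from
            // object storage, not broker replication.
            NewTopic topic = new NewTopic("sensor-readings", 10_000, (short) 1)
                    .configs(Map.of(
                            // ASSUMPTION: opt-in flag per the KIP-1150 discussions;
                            // the final config name may differ in upstream Kafka.
                            "diskless.enable", "true",
                            // Retention is configured per topic, so each topic can differ.
                            "retention.ms", String.valueOf(7L * 24 * 60 * 60 * 1000)));
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```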

Note 1:
The current Inkless implementation of Diskless Topics uses PostgreSQL for batch metadata coordination. It is not yet clear whether PostgreSQL will be part of any future Apache Kafka Diskless implementation, but whatever coordination mechanism is chosen must scale with increasing partition counts.

Note 2:
High partition counts require sufficiently high key cardinality if you rely on the default key partitioner (alternatively, the producer can write to explicit partitions). My rule of thumb is at least 30× more distinct keys than partitions, to avoid partition starvation caused by hashing collisions (see “Knuth’s parking problem”). For example, 100,000 partitions imply a minimum of roughly 3 million distinct keys.
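
To ground that rule of thumb, a standard balls-in-bins approximation (not necessarily the exact parking-problem analysis) says the expected number of partitions that no key hashes to is roughly p(1 − 1/p)^k ≈ p·e^(−k/p), which is negligible at a 30× key-to-partition ratio. The sketch below computes this and shows the explicit-partition alternative; the topic and the sensor-to-partition mapping are hypothetical:

```java
import org.apache.kafka.clients.producer.ProducerRecord;

public class PartitionStarvationCheck {
    // Expected number of partitions that no key hashes to, under uniform hashing
    // of k distinct keys into p partitions: p * (1 - 1/p)^k ~= p * e^(-k/p).
    static double expectedEmptyPartitions(long partitions, long distinctKeys) {
        return partitions * Math.exp(-(double) distinctKeys / partitions);
    }

    public static void main(String[] args) {
        // At the 30x rule of thumb: 100,000 partitions, 3 million distinct keys.
        System.out.println(expectedEmptyPartitions(100_000, 3_000_000)); // ~9.4e-9, negligible

        // Alternative: bypass key hashing by writing to an explicit partition.
        int partition = 42; // hypothetical sensor-to-partition mapping
        ProducerRecord<String, String> record =
                new ProducerRecord<>("sensor-readings", partition, "sensor-42", "payload");
        System.out.println(record.partition()); // the producer would send this record
    }
}
```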

Spiky or fluctuating workloads

Perhaps the most significant outcome of the diskless architecture is that Kafka brokers become effectively stateless. This enables elastic autoscaling, allowing brokers to scale up or down in response to gradual or sudden workload changes.

This makes Kafka a more effective buffer for spiky workloads—short, irregular bursts that exceed baseline load—and reduces the need to provision permanently oversized clusters. Costs can be reduced by scaling down during off-peak periods.

Assuming effective autoscaling, broker overload should no longer cause additional end-to-end latency, partially compensating for the higher baseline latency of diskless operation. However, other bottlenecks must still be understood and monitored carefully, including the diskless coordinator and the performance variability of cloud object storage.
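
On the client side, one option for that monitoring is the producer’s built-in metrics, which in the stock Java client include request latency and record queue time. A minimal sketch:

```java
import java.util.Map;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

public class ClientLatencyMonitor {
    // Print the producer's latency-related metrics; in practice you would
    // export these to your monitoring system on a timer.
    static void logLatencyMetrics(KafkaProducer<?, ?> producer) {
        for (Map.Entry<MetricName, ? extends Metric> entry : producer.metrics().entrySet()) {
            String name = entry.getKey().name();
            if (name.equals("request-latency-avg") || name.equals("record-queue-time-avg")) {
                System.out.printf("%s = %s%n", name, entry.getValue().metricValue());
            }
        }
    }
}
```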

Some concrete bursty workload examples are flash sales, incident storms, marketing campaigns, system retries after outages, and migration and backfill pipelines, which are often short-lived but very bursty!

When not to use Kafka Diskless Topics

While Diskless Topics introduce compelling benefits—particularly around elasticity and durability—they are not a universal replacement for classic Kafka. The following workloads would likely be a poor fit, typically representing the inverse of the characteristics described above.

Low and predictable end-to-end latency requirements

Workloads that require consistently low latency—milliseconds rather than seconds—are not well suited to diskless topics.

This includes “real-time” or near-real-time systems where immediate action is required, such as user-facing applications, fraud detection, synchronous request/response pipelines, or operational alerting systems. In these scenarios, both the higher baseline latency and the increased variability violate the real-time non-functional requirements.

Latency-sensitive or synchronous producers

Applications that depend on fast producer acknowledgements, or that are tightly coupled to downstream processing, will struggle with diskless topics.

Producer latency of seconds—rather than milliseconds—can significantly impact user experience, throughput, or system design, particularly where:

  • Producers operate in synchronous workflows
  • Backpressure cannot be easily absorbed
  • Increasing producer concurrency is not feasible

Examples include transactional systems, APIs that synchronously publish events as part of a request cycle, tightly coupled microservices, and resource-constrained producers whose numbers cannot be increased.

These patterns assume “fast write” semantics that diskless architectures intentionally trade off.
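
To make the trade-off concrete, this is the shape of code that breaks down on diskless topics: a hypothetical request handler that blocks on the producer acknowledgement before responding.

```java
import java.util.concurrent.TimeUnit;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class CheckoutHandler { // hypothetical request handler
    private final KafkaProducer<String, String> producer;

    CheckoutHandler(KafkaProducer<String, String> producer) {
        this.producer = producer;
    }

    String handleRequest(String orderId, String payload) throws Exception {
        // Anti-pattern on diskless topics: the caller's request now waits on a
        // remote-storage write, turning a ~millisecond ack into ~seconds.
        RecordMetadata meta = producer
                .send(new ProducerRecord<>("orders", orderId, payload))
                .get(5, TimeUnit.SECONDS); // blocks the request cycle
        return "accepted at offset " + meta.offset();
    }
}
```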

Latency-sensitive or tightly coupled consumers

Workloads where consumers must process events immediately after they are produced are also a poor fit.

Diskless topics introduce additional read latency, especially when data must be retrieved from remote storage rather than the broker cache. This makes them unsuitable for:

  • Real-time dashboards or monitoring systems
  • Event-driven architectures with strict end-to-end SLAs
  • Feedback loops (for example, control systems or adaptive decision making)
  • Resource-constrained consumption where the number of consumers cannot be increased

In such systems, delays of even hundreds of milliseconds—or occasional seconds—can accumulate and cause issues, including errors and a loss of responsiveness.

Small numbers of partitions with modest scale

Diskless topics need more partitions to reach high throughput, and in return scale better to large partition counts. They therefore offer limited benefit for workloads with only a small number of partitions. If your use case:

  • Requires only a small number of partitions
  • Does not suffer from replication overhead (typical with RF=3 and many partitions)
  • Does not need fine-grained ordering domains or isolation

Then classic Kafka is often simpler, more predictable, and more cost-efficient. In particular, workloads with low throughput or limited parallelism may not benefit from the architectural trade-offs inherent in diskless designs.

Stable, predictable workloads with minimal burstiness

Workloads that are stable and predictable gain little from the potential for Diskless Kafka to auto-scale in response to fluctuating demand.

If your workload:

  • Has steady, well-understood throughput
  • Is already efficiently provisioned
  • Does not require rapid scale-up or scale-down

Then the operational and architectural changes introduced by diskless topics may not justify the benefits.

In essence, Kafka Diskless Topics are not a direct replacement for classic Kafka; they represent a different set of trade-offs. If your workload demands low latency or tight coupling, favours predictability over elasticity and cost optimisation, or runs resource-constrained clients, then traditional Kafka architectures remain the better fit.

Here’s a table that summarises the use cases that best fit “Classic Kafka” and “Diskless Topics”:

Use case             Classic Kafka       Diskless Topics
End-to-end latency   Fast, predictable   Slow, unpredictable
Producers            Fast                Slow, high concurrency
Consumers            Fast                Slow, high concurrency
Partitions           Fewer               More
Workload             Stable              Bursty

Classic Kafka workloads are fast and predictable, with stable throughput and fewer partitions. Diskless workloads are slower and less predictable, and require high producer and consumer concurrency (plus the resources to support it), but support more partitions and bursty workloads.

Final thoughts on Kafka Diskless Topics

Kafka Diskless Topics work best for workloads where durability and elasticity matter far more than achieving low-latency processing.

Further tuning and optimization opportunities will undoubtedly emerge as official Apache Kafka implementations become widely available and operational experience grows. Consumer throughput impacts warrant further investigation, as they may constrain certain use cases to environments that do not require continuous, low-latency consumption.

Kafka Diskless Topics relate closely to Kafka Tiered Storage, particularly on the read path. As a result, they likely share many of the same use cases, including buffering, disconnected consumers, large-scale replay, and seamless cluster migrations.

In the meantime, you can try out our managed Apache Kafka with Tiered Storage. This allows you to explore the immediate impact and benefits of storing most of your data on remote cloud storage. All records, other than those in active segments, are automatically uploaded to cloud storage and become the only copy once they exceed the local segment retention threshold.