Kafka® Streams error handling, Part 2: How to catch and fix bad records with KIP-1034 and LLMs

This “Apache Kafka Streams news processing application with KIP-1034” blog series is split into two parts.

Part 1 focused on the architecture: the data model, error scenarios, and the Kafka Streams topology that connects producers, DLQs, and repair loops.

Part 2 covers the mechanics of failure handling. We’ll show you how failures are detected, routed, and repaired using KIP-1034 and LLMs. You can find the complete demo Java Kafka Streams processing code, including documentation and setup for the LLMs and instructions for running and sample output, in this GitHub.

Comparing the two failure modes in an Apache Kafka Streams application

Kafka Streams handles two distinct failure types. Some records fail before they ever enter the topology. Others flow through the topology but fail business validation.

In the demo, producers deliberately send two types of “bad” records so we can see how failures are handled and who is responsible for fixing them.

At a high level:

Some records fail before they even enter the topology
Others flow through the topology but fail business validation

	Free-form prose (not JSON)	JSON with empty categories
What breaks	The bytes are not JSON at all, so they cannot become a NewsItem.	Jackson succeeds: the payload is valid NewsItem JSON, but `hasAnyCategory()` is false.
Where it’s detected	In `JsonNewsItemSerde` during deserialization, before any KStream transform runs.	After deserialization, in `NewsFreeformTopology` on the merged `NewsItem` stream (split branch).
How it’s “caught”	The deserializer throws `SerializationException`. Streams invokes `LogAndContinueExceptionHandler`, which (with `errors.dead.letter.queue.topic.name`) produces to `demo-news-ff-streams-dlq`. The read offset still advances, so the app keeps running.	No throw at the edge: the record is a normal NewsItem. The topology `mapValues` the row into a DLQ envelope string and to `demo-news-ff-app-dlq`.
What lands on the DLQ	Raw bytes (same as on the wire) plus `__streams.errors.*` headers (exception class, source `topic/partition/offset`, etc.).Application-defined JSON (`reason`, `articleId`, `rawJson`, `detail`)—easy for a dedicated consumer to parse.	Application-defined JSON (reason, articleId, rawJson, detail)—easy for a dedicated consumer to parse.
Who repairs it	`NewsFreeformKip1034RepairMain: ByteArrayDeserializer`, interpret UTF-8, `structurePlainTextToNewsItemJson` (or fallbacks), produce to `demo-news-ff-repaired`.	`NewsFreeformAppDlqRepairMain: parse envelope`, `inferCategoriesJson` (or heading-token fallback), produce to `demo-news-ff-repaired`.
What’s the point?	KIP-1034 / framework DLQ: serde and wire-format problems are handled outside the map/branch code; you need a consumer that understands bytes + error headers.	Application DLQ: your business rules after a successful parse; same Kafka machinery for produce/consume, but you chose the envelope schema and the policy.

Why include both approaches in the demo? Because there are two distinct ways of detecting and handling bad records:

The KIP-1034 dead letter que (DLQ) works at the framework-level. It handles technical failures like deserialization and wire format issues. You get raw bytes + metadata, but it requires a specialized repair consumer.

The application DLQ works at the business level. It handles domain validation failures. You define the schema and rules, and it’s easier to process and repair.

A third type of record, fully valid JSON with categories, flows through the system normally and exercises only the “happy path”.

How the KIP-1034 Dead Letter Queue catches failures

Let’s recap. How does KIP-1034 catch failures? What sorts and how?

KIP‑1034 doesn’t introduce new failure detection. Instead, it hooks into the existing Kafka Streams error detection points and changes what happens after a failure is detected.

Here’s the key idea

Detection = existing exception handling system
KIP‑1034 = standardized response (DLQ + resume/fail)

Before KIP‑1034, Kafka Streams already detected failures in three places.

Catching a Kafka deserialization error: When reading records from Kafka topics, this phase catches failures like an invalid schema or corrupted bytes. The DeserializationExceptionHandler handles it.

Stream processing exception handling: Caught by the topology logic, this phase catches failures like a NullPointerException or business logic errors. The ProcessingExceptionHandler handles it (introduced in KIP-1033).

The production phase: When writing results back to Kafka topics, this phase catches failures like serialization issues or broker errors. The ProductionExceptionHandler handles it.

These handlers already existed (or were completed with KIP‑1033) and are the actual detection points, and whenever an exception is thrown at one of these stages, Streams considers the record “failed”.

KIP‑1034 adds a new unified response model when a failure is detected. Previously, an exception produced a log (which skipped the failed record) or crashed the streams application. Now, with KIP-1034, an exception is handled, produces a structured response, and can either fail or resume (skipping the record and continuing); and (optionally) sending the record and metadata to a DLQ for repair.

In this demo, we use deserialization failures (freeform text routed to a DLQ) and processing-time validation (missing categories routed to an application DLQ), but not production failures.

Where LLM streaming data repair fits in

LLMs are used in three very specific places, always outside the main streaming path. This is intentional. LLMs are slower and less predictable than stream processing, so they are only used for data generation, repair, and enrichment.

What model, what API Power the LLM Streaming Data Repair?

I used Meta’s Ollama Llama 3 LLM. Everything goes through local Ollama over HTTP: POST. The code does not hard‑wire external models. There are three LLM jobs, which means three prompts in NewsLlmClient:

LLM news generation: `generateSyntheticArticles`

Used only by NewsFreeformRandomLlmProducerMain and NewsFreeformStreamingLlmProducerMain. Asks for a JSON array of fictional news NewsItems (batch size capped at 40 per call), optional theme from env/CLI.

Prompt:

Let n = bounded count. Let theme = hint or default mixed beats worldwide (science, business, local politics, sport, culture, environment).

You invent plausible but fictional news briefs for a Kafka Streams QA demo. Output ONLY a JSON array of exactly <n> objects. No markdown fences, no text before or after the array. Each object MUST have keys: articleId (string, unique across the array), heading (one line, <=120 chars), article (1–4 sentences), categories (array of 1–5 short topical English tags; invent tags freely, no fixed list). Vary tone and region; avoid repeating the same headline pattern. Bias topics toward: <theme>.

LLM repair – structure Strings to JSON: structurePlainTextToNewsItemJson

Used by NewsFreeformKip1034RepairMain. Reads KIP-1034 DLQ value bytes as UTF-8 prose (or junk), asks the model to return one JSON object shaped like a NewsItem (articleId, heading, article, categories). Read timeout 240s. Output is stripped of markdown fences and the leading JSON is isolated before parse.

Signature: structurePlainTextToNewsItemJson(articleIdHint, plainText).

Prompt:

idRule is either:

If articleIdHint blank: Use articleId as a short unique string (e.g. gen- plus random alphanumeric).

Else: Use exactly this articleId string: + hint.

You structure news text for a Kafka Streams pipeline. Output ONLY one JSON object with keys articleId (string), heading (one line, <=120 chars), article (the story; you may lightly edit the source text for clarity), categories (array of 1-5 short topical English tags).

<idRule> Your entire reply MUST start with { as the first non-whitespace character — no preamble, no markdown fences, no “Here is” text.

Source text: <plainText>

LLM enrichment – infer missing categories: inferCategoriesJson

Used by NewsFreeformAppDlqRepairMain for MISSING_CATEGORIES (and a BAD_JSON branch if you ever feed that envelope). Sends the current NewsItem JSON and asks for the same four keys back with categories filled (1–6 tags).

Prompt:

Let input = NewsJson.stringify(item) (full four-key JSON).

You label news for downstream topic grouping. Given JSON with keys articled, heading, article, categories (array, may be empty). Output ONLY one JSON object with the SAME four keys. Fill categories with 1 to 6 short topical tags in English (nouns or short phrases, lowercase is fine). Do not copy the heading verbatim as the only tag. No markdown, no commentary.

Input: <input>

What is not LLM‑driven? Not everything relies on the LLMs. The producer for JSON with missing categories only produces one identical fixed news item about an incident at a museum involving escaping ping-pong balls! And repairs do not require the LLM to succeed or be available – there are fall-backs.

What this means for streaming plus AI

This demo brings together Kafka Streams, KIP-1034, and GenAI/LLM news generation with two types of detection and repair. One that uses KIP-1034 and the other that relies on Kafka Streams itself—and it works!

Streaming Kafka data plus AI is perhaps a more viable combination than you might expect, with real potential for more use cases and applications in the future. Here are some AI-generated ideas that fit well at the intersection of Kafka Streams (streaming data, stateful, ordered, scalable consumers; joins, windows, exactly/at-least-once) and LLMs (messy text in/out, judgment calls, formatting):

Routing and classification: dynamically branch streams based on meaning
Windowed summaries: generate human-readable reports from event batches
Schema repair: fix evolving or inconsistent producer formats
Personally Identifiable Information (PII) detection/redaction: identify sensitive data before storage
RAG pipelines: streaming ingestion + embedding + indexing
Conversational backends: combine session state with LLM responses
Anomaly explanation: attach natural-language explanations to detected events
Localization: generate region-specific variants of messages
Synthetic data generation: create realistic demo or test streams

And the “pièce de resistance” (maybe) is a possible extension I have in mind: using Kafka Streams plus Vector Search plus AI to detect “breaking news” stories!

Final thoughts

Not all of these use cases will be practical. Latency, scalability, cost, and LLM variability are real constraints worth taking seriously.

But this demo shows something important. Streaming systems and LLMs don’t have to compete. They can complement each other. Kafka handles scale, ordering, and state. LLMs handle ambiguity, interpretation, and repair. Used together, they open up new streaming plus AI design patterns that are worth exploring.

So what bad record will you fix first? (Note: you’ll need Kafka 4.2.0 or above).

You can test out KIP-1034 for development in the open source version of Kafka 4.2.0 available for download here, and use it in production when it becomes available and supported on our managed Kafka platform in the future.

You can also explore our managed Apache Kafka offerings and try a free trial today to experience stress-free data streaming and innovations such as this in the future.

Kafka® Streams error handling, Part 2: How to catch and fix bad records with KIP-1034 and LLMs

Comparing the two failure modes in an Apache Kafka Streams application

How the KIP-1034 Dead Letter Queue catches failures

Where LLM streaming data repair fits in

What model, what API Power the LLM Streaming Data Repair?

LLM news generation: generateSyntheticArticles

LLM repair – structure Strings to JSON: structurePlainTextToNewsItemJson

LLM enrichment – infer missing categories: inferCategoriesJson

What this means for streaming plus AI

Final thoughts

About the author

Get the latest articles for open sourceIn your inbox

Sign upto ourNewsletter

LLM news generation: `generateSyntheticArticles`

Get the latest articles for open source
In your inbox

Sign up
to our
Newsletter