NetApp, a leading provider of fully managed open source data and workflow application solutions, is introducing a significant enhancement to its managed service Instaclustr for Apache Kafka®. The new feature, called Client Telemetry, delivers broker-integrated visibility into Kafka client and application-level metrics, so users can collect client telemetry centrally and export it to their own monitoring systems.

Apache Kafka is the backbone of countless real-time data pipelines and streaming applications. As these environments grow in complexity and scale, the ability to effectively monitor the health and performance of every component, including the clients interacting with the Kafka brokers, becomes ever more important. With Client Telemetry, Instaclustr for Apache Kafka users can now gain immediate visibility into client behaviour such as connection status, request rates, error rates, and latency directly from the broker. The feature is designed to significantly simplify the monitoring setup and provide a holistic view of client interactions.

With this new feature, compliant Kafka clients (see our support page for more details) collect their metrics and push them to the Kafka brokers. The brokers receive these metrics and, using an OpenTelemetry Collector, forward them to a customer-specified destination that supports OpenTelemetry. For this initial release, Prometheus 3.0+ and Datadog are supported as end destinations. Alternative destinations such as ClickHouse, Splunk, and others could be supported in the future. If you require one of these, please let us know.

Here is a simplified architectural diagram showing the various components:

Kafka client telemetry data flow

The newly introduced capabilities revolve around providing more granular and actionable metrics directly from the Kafka brokers:

  • Centralized client metrics with KIP-714: Traditionally, gathering detailed metrics from Kafka producers, consumers, and admin clients required separate instrumentation and collection mechanisms within each client application. KIP-714 drastically simplifies this by enabling Kafka brokers to centrally track a wide array of client-side metrics on behalf of the clients.
  • Simplified configuration management with KIP-1000: Supporting the centralized collection of client metrics, KIP-1000 introduces standardized mechanisms to list the configuration of these client metrics resources. This enables consistent configuration across large fleets.
  • Extended observability to application-level metrics with KIP-1076: Building on KIP-714, KIP-1076 extends monitoring capabilities by allowing client applications (such as those with embedded client instances like Kafka Streams) to register and expose their own custom metrics through the same centralized Kafka metrics reporting system. With this, custom application health indicators can be monitored alongside standard Kafka client metrics, providing a complete end-to-end view of the streaming pipeline’s performance. This is particularly useful for Kafka Streams applications.
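
To make the KIP-714 subscription model more concrete, here is a minimal Python sketch of how a broker-side client metrics subscription could decide which metrics a given client is asked to push. It assumes the subscription properties described in KIP-714 (`metrics` as a list of metric-name prefixes, `interval.ms` as the push interval, and `match` as a set of client selectors with regular expressions). The `Subscription` class, the helper functions, and the full-match regex behaviour are hypothetical illustrations, not the broker's actual implementation or an Instaclustr API.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Subscription:
    """Hypothetical in-memory model of one client metrics subscription."""
    name: str
    metrics: list[str]            # metric-name prefixes; "*" selects everything
    interval_ms: int = 300_000    # how often matching clients are asked to push
    match: dict[str, str] = field(default_factory=dict)  # selector -> regex


def client_matches(sub: Subscription, client: dict[str, str]) -> bool:
    """Assumption: a client matches only if every selector regex fully
    matches the corresponding client property."""
    return all(
        key in client and re.fullmatch(pattern, client[key]) is not None
        for key, pattern in sub.match.items()
    )


def selected_metrics(sub: Subscription, client: dict[str, str],
                     available: list[str]) -> list[str]:
    """Return the metric names this client would push for the subscription."""
    if not client_matches(sub, client):
        return []
    if "*" in sub.metrics:
        return list(available)
    return [m for m in available
            if any(m.startswith(prefix) for prefix in sub.metrics)]


# Example: subscribe Java producers to their record queue time metrics only.
sub = Subscription(
    name="producer-latency",  # hypothetical subscription name
    metrics=["org.apache.kafka.producer.record.queue.time."],
    interval_ms=60_000,
    match={"client_software_name": "apache-kafka-java"},
)
client = {"client_id": "orders-service",  # hypothetical client
          "client_software_name": "apache-kafka-java"}
available = [
    "org.apache.kafka.producer.record.queue.time.avg",
    "org.apache.kafka.producer.record.queue.time.max",
    "org.apache.kafka.producer.connection.creation.rate",
]

print(selected_metrics(sub, client, available))
```

In this sketch only the two `record.queue.time` metrics are selected for the matching client, while a client whose `client_software_name` does not match the selector is asked for nothing. How subscriptions are created and managed on the managed platform is covered on the support page rather than by this example.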

The integration of these KIPs into the Instaclustr for Apache Kafka managed service offers several key advantages:

  • Enhanced observability with faster troubleshooting: Gain a much deeper understanding of how your Kafka clients and applications are interacting with your Kafka clusters, allowing you to accelerate root cause analysis by correlating client, broker, and application metrics within a unified view.
  • Proactive and reactive client monitoring: Identify and diagnose client-side or application-specific problems more quickly by having access to detailed, standardized metrics directly from the brokers.
  • Simplified monitoring architecture: Reduce the complexity of your monitoring stack by centralizing client and application metric collection through Kafka itself.

“At Instaclustr, we are committed to providing our customers with a 100% open source, production-ready, and easy-to-operate data infrastructure. The addition of these advanced client and application metrics capabilities aligns perfectly with our mission. It empowers our users with the client-side visibility they need to confidently run and scale their mission-critical Kafka workloads.”

—Ben Slater, VP & GM of Instaclustr

The feature (KIP-714 and KIP-1000) is available on our managed platform for customers running Kafka 3.9.x, while the incremental enhancements introduced with KIP-1076 to support application-level metrics are available starting with Kafka 4.0. Details on how to enable and configure it for new and existing clusters are available on our dedicated support page. If you need any help, Instaclustr’s expert support team is also available 24×7 to assist.