Consumer Group Metrics
The Consumer Group Metrics group contains the metrics collected on Kafka consumer groups that are live and consuming from topics. The available metrics are:
- Consumer Count per Client
- Consumer Lag per Client
- Partition Count per Client
Before diving into the individual metrics, note that all metrics are collected at the client level within a consumer group and are separated by consumed topic. The metrics are available only if Kafka is used as the consumer offset store.
A consumer group client is a logical grouping defined by setting the configuration property client.id when connecting to a Kafka cluster. Most Kafka consumer libraries support this configuration.
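As a minimal sketch of how a client is defined, the snippet below builds a consumer configuration that sets client.id alongside the group ID. The property names bootstrap.servers, group.id and client.id are standard Kafka configuration keys; the helper name build_consumer_config and the example values are assumptions made for illustration. The resulting dictionary is only constructed here, and how it is passed to a consumer depends on the library in use.

```python
def build_consumer_config(bootstrap_servers, group_id, client_id):
    """Return a consumer configuration keyed by standard Kafka property names."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,    # the consumer group the consumer joins
        "client.id": client_id,  # the logical client the metrics are grouped by
    }


# Hypothetical values: every consumer started with this configuration is
# reported under the client "orders-worker" in the group "orders-processors".
config = build_consumer_config("localhost:9092", "orders-processors", "orders-worker")
```

Consumers that share the same client.id within a group are aggregated into a single client for the metrics described below.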
Consumer Count per Client
This is the number of consumers with the same client ID within the consumer group that are consuming from a particular topic. This metric can help diagnose whether the consumer group is healthy, that is, whether all consumers are alive and consuming from the topic as expected.
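The grouping can be illustrated with a short sketch. It assumes a hypothetical snapshot of group members given as (consumer ID, client ID, topic) tuples; the data and the function name consumer_count_per_client are made up for illustration and do not reflect how the metric is collected internally.

```python
from collections import Counter

# Hypothetical snapshot of group members: (consumer_id, client_id, topic).
members = [
    ("consumer-1", "orders-worker", "orders"),
    ("consumer-2", "orders-worker", "orders"),
    ("consumer-3", "audit-worker", "orders"),
]


def consumer_count_per_client(members):
    """Count distinct consumers for each (client ID, topic) pair."""
    # De-duplicate first so a consumer listed twice is counted only once.
    distinct = {(client_id, topic, consumer_id)
                for consumer_id, client_id, topic in members}
    return Counter((client_id, topic) for client_id, topic, _ in distinct)


counts = consumer_count_per_client(members)
# counts[("orders-worker", "orders")] is 2, counts[("audit-worker", "orders")] is 1
```

A count lower than the number of consumers that were deployed for a client suggests that some of them are no longer alive or no longer consuming from the topic.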
Consumer Lag per Client
This is the sum of consumer lags for a topic, grouped by client ID within a consumer group. Consumer lag is defined as the difference between the log end offset of a partition and the offset last committed by the consumer for that partition.
This metric can help diagnose consumer latency issues and pinpoint the clients that are lagging behind. High consumer lag overall also indicates that the consumers cannot keep up with producer throughput, which can be addressed by either scaling up the consumers or adding more consumers to the group.
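The lag definition and the per-client sum can be sketched as follows. The input is a hypothetical list of (client ID, topic, partition, committed offset, log end offset) tuples, and the function name lag_per_client is an assumption for illustration. Clamping negative values to zero is also an assumption, made to guard against offsets that were sampled at slightly different moments.

```python
from collections import defaultdict

# Hypothetical snapshot:
# (client_id, topic, partition, committed_offset, log_end_offset)
partitions = [
    ("orders-worker", "orders", 0, 950, 1000),
    ("orders-worker", "orders", 1, 980, 1000),
    ("audit-worker", "orders", 2, 1000, 1000),
]


def lag_per_client(partitions):
    """Sum partition lags for each (client ID, topic) pair."""
    totals = defaultdict(int)
    for client_id, topic, _partition, committed, log_end in partitions:
        # Lag for one partition: log end offset minus last committed offset.
        totals[(client_id, topic)] += max(log_end - committed, 0)
    return dict(totals)


lags = lag_per_client(partitions)
# lags[("orders-worker", "orders")] is 70 (50 + 20)
# lags[("audit-worker", "orders")] is 0
```

In this example the client orders-worker is 70 messages behind on the topic, while audit-worker has caught up.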
Partition Count per Client
This is the number of partitions assigned to consumers with the same client ID within a consumer group that are consuming from a particular topic. This metric can help diagnose whether the workload is distributed appropriately across clients. A high partition count on a single client compared to the others might indicate that the workload is skewed, which reduces performance if all clients have similar compute capacity.
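One way to act on this metric is to compare each client's partition count with the average for the topic. The sketch below assumes a hypothetical list of (client ID, partition) assignments for a single topic; the function names and the 1.5 threshold ratio are illustrative choices, not values prescribed by the metric.

```python
from collections import Counter

# Hypothetical assignments for one topic: (client_id, partition).
assignments = [
    ("orders-worker-a", 0), ("orders-worker-a", 1), ("orders-worker-a", 2),
    ("orders-worker-a", 3), ("orders-worker-a", 4),
    ("orders-worker-b", 5),
    ("orders-worker-c", 6),
    ("orders-worker-d", 7),
]


def partition_count_per_client(assignments):
    """Count the partitions assigned to each client ID."""
    return Counter(client_id for client_id, _partition in assignments)


def skewed_clients(counts, ratio=1.5):
    """Return clients whose partition count exceeds ratio times the mean."""
    if not counts:
        return []
    mean = sum(counts.values()) / len(counts)
    return sorted(client for client, n in counts.items() if n > ratio * mean)


counts = partition_count_per_client(assignments)
skewed = skewed_clients(counts)
# The mean is 2 partitions per client, so only "orders-worker-a" (5) is flagged.
```

A flagged client is only a hint: if clients differ in compute capacity, an uneven assignment may be intentional.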