The worker rebalance metrics group contains per-worker metrics relating to the rebalancing of tasks. The available metrics are:
- Completed rebalances
- Time since last rebalance
- Current Epoch
- Leader State
See also the Kafka Connect metrics available in the monitoring API.
Completed rebalances
The completed rebalances metric shows the number of rebalancing operations each worker has gone through since startup. An increase in this metric marks a point at which your cluster's task distribution changed, which helps you correlate rebalances with other events in the cluster.
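As a rough illustration of how this counter can be read, the sketch below turns a series of samples for one worker into a list of points where a rebalance happened. The function name `rebalance_events` and the `(timestamp, count)` sample format are assumptions made for this example and are not part of the monitoring API. Because the metric counts from worker startup, a drop in the value is treated here as a restart.

```python
from typing import List, Tuple


def rebalance_events(samples: List[Tuple[float, int]]) -> List[Tuple[float, int]]:
    """Return (timestamp, new_rebalances) for each sample where the count grew.

    `samples` is a hypothetical list of (timestamp, completed_rebalances)
    pairs for a single worker, ordered by time.
    """
    events = []
    for (_, prev), (ts, cur) in zip(samples, samples[1:]):
        if cur < prev:
            # The counter went backwards, so assume the worker restarted
            # and everything it reports now happened since the restart.
            delta = cur
        else:
            delta = cur - prev
        if delta > 0:
            events.append((ts, delta))
    return events


if __name__ == "__main__":
    samples = [(0, 3), (60, 3), (120, 5), (180, 1), (240, 1)]
    for ts, delta in rebalance_events(samples):
        print(f"t={ts}: {delta} rebalance(s) since previous sample")
```

Lining these timestamps up against deployments, connector changes, or worker restarts is one way to see which events triggered a rebalance.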
Time since last rebalance
Time since last rebalance shows the time, in milliseconds, since each worker last participated in a rebalance. Consistently low numbers or sawtooth-like behaviour, where the value repeatedly climbs and then drops back towards zero, can indicate instability or a high degree of change in your cluster.
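One way to spot the sawtooth pattern programmatically is to count how often the value drops between consecutive samples, since each drop means a rebalance occurred in between. This is a minimal sketch: the function names and the `max_rebalances` threshold are hypothetical, and a suitable threshold depends on your sampling interval and how often you expect the cluster to change.

```python
from typing import Sequence


def count_rebalances(samples_ms: Sequence[int]) -> int:
    """Count drops in a time-ordered series of 'time since last rebalance' values."""
    return sum(1 for prev, cur in zip(samples_ms, samples_ms[1:]) if cur < prev)


def looks_unstable(samples_ms: Sequence[int], max_rebalances: int = 1) -> bool:
    """Flag a window of samples that contains more drops than expected.

    `max_rebalances` is an illustrative threshold, not a recommended value.
    """
    return count_rebalances(samples_ms) > max_rebalances


if __name__ == "__main__":
    window = [1000, 31000, 61000, 2000, 32000, 500, 30500]
    print(count_rebalances(window), looks_unstable(window))
```

Rebalances that happen more often than the metric is sampled can be missed by this approach, so it gives a lower bound on the number of rebalances in the window.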
Current Epoch
The current epoch metric shows a number that identifies which layout of tasks each worker is operating under. This metric should increase monotonically and be in sync across all the nodes in your cluster. Brief differences during a rebalance are expected, but a long-term discrepancy in current epoch between nodes indicates that your nodes are not communicating with each other properly.
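The check described here can be sketched as a comparison of each worker's epoch against the highest epoch seen, repeated over several polls so that short-lived differences are ignored. The worker names, the dictionary format, and the `min_polls` value are assumptions for illustration only.

```python
from typing import Dict, List


def lagging_workers(epochs: Dict[str, int]) -> List[str]:
    """Return workers whose epoch is behind the highest epoch in this poll."""
    if not epochs:
        return []
    newest = max(epochs.values())
    return sorted(w for w, e in epochs.items() if e < newest)


def persistently_lagging(polls: List[Dict[str, int]], min_polls: int = 3) -> List[str]:
    """Return workers that were behind in each of the last `min_polls` polls.

    `polls` is a hypothetical time-ordered list of {worker: current_epoch}
    snapshots; `min_polls` is an illustrative value, not a recommendation.
    """
    if min_polls < 1 or len(polls) < min_polls:
        return []
    recent = [set(lagging_workers(p)) for p in polls[-min_polls:]]
    return sorted(set.intersection(*recent))


if __name__ == "__main__":
    polls = [
        {"worker-1": 4, "worker-2": 4, "worker-3": 3},
        {"worker-1": 5, "worker-2": 5, "worker-3": 3},
        {"worker-1": 5, "worker-2": 5, "worker-3": 3},
    ]
    print(persistently_lagging(polls))
```

Requiring several consecutive polls before flagging a worker is a design choice that avoids raising an alert while a rebalance is still in progress.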
Leader State
The leader state metric shows which worker node believes it is the leader for this Kafka Connect cluster. Only one node should be the leader at any one time. Multiple leaders indicate that your nodes are not communicating with each other properly.
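The single-leader rule can be checked with a few lines of code. The sketch below assumes, purely for illustration, that each worker's leader state has already been collected into a dictionary of boolean flags; the actual shape of the metric value may differ, and the function names are hypothetical.

```python
from typing import Dict, List


def claimed_leaders(leader_state: Dict[str, bool]) -> List[str]:
    """Return the workers that currently report themselves as leader."""
    return sorted(w for w, is_leader in leader_state.items() if is_leader)


def has_multiple_leaders(leader_state: Dict[str, bool]) -> bool:
    """True when more than one worker believes it is the leader."""
    return len(claimed_leaders(leader_state)) > 1


if __name__ == "__main__":
    state = {"worker-1": True, "worker-2": False, "worker-3": True}
    if has_multiple_leaders(state):
        print("Multiple leaders:", ", ".join(claimed_leaders(state)))
```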