Kafka Connect Metrics

Authentication

All requests to the API must use Basic Authentication and contain a valid username and monitoring API key. API keys are created per user account and can be retrieved via the Instaclustr Console from the Account > API Key tab.
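As a minimal sketch of the Basic Authentication requirement above, the header can be built from the username and monitoring API key using only the Python standard library. The function name and the sample credentials are hypothetical and not part of the API.

```python
import base64


def basic_auth_header(username: str, api_key: str) -> dict:
    """Return an HTTP Basic Authentication header.

    Assumption: the monitoring API key is used in the password position
    of the standard "username:password" Basic credential pair.
    """
    credentials = f"{username}:{api_key}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Hypothetical credentials, for illustration only.
headers = basic_auth_header("example-user", "example-api-key")
```

Most HTTP clients can also build this header themselves when given the username and key as a credential pair.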

Requests

Metrics are requested by constructing a GET request, consisting of:

  • type: Either ‘clusters’, ‘datacentres’ or ‘nodes’.
    • ‘clusters’ returns the metrics for each node in the cluster.
    • ‘datacentres’ returns the metrics for each node belonging to the datacentre.
    • ‘nodes’ returns the metrics for a specific node.
  • UUID or public IP: If the type is set to ‘clusters’ or ‘datacentres’, then the UUID of the cluster or datacentre must be specified. However, if the type is set to ‘nodes’, then either the node’s UUID or its public IP may be specified.
  • metrics: The metrics to return are specified as a comma-delimited query string parameter. Up to 20 metrics may be specified.
  • reportNaN: (true|false) If a metric value is NaN or null, reportNaN determines whether the API reports it as NaN. The default is false, in which case NaN and null are reported as 0. Setting ‘reportNaN=true’ returns NaN values in the API response.
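The request components above can be sketched as a small helper that validates them and assembles a relative path and query string. The path layout (type, then identifier, then query string) and the function name are assumptions made for illustration; the host and base path are deliberately left out because they are not stated here.

```python
from urllib.parse import urlencode

VALID_TYPES = {"clusters", "datacentres", "nodes"}
MAX_METRICS = 20  # limit stated in the parameter list above


def build_metrics_request(resource_type, identifier, metrics, report_nan=False):
    """Return '<type>/<identifier>?metrics=...' for a metrics GET request.

    Assumption: the type and identifier are path segments and the metrics
    are passed as a single comma-delimited 'metrics' query parameter.
    """
    if resource_type not in VALID_TYPES:
        raise ValueError(f"type must be one of {sorted(VALID_TYPES)}")
    if len(metrics) > MAX_METRICS:
        raise ValueError(f"at most {MAX_METRICS} metrics may be requested")
    params = {"metrics": ",".join(metrics)}
    if report_nan:
        # Only sent when NaN values should be returned instead of 0.
        params["reportNaN"] = "true"
    # Keep ':' and ',' unescaped so metric names stay readable.
    return f"{resource_type}/{identifier}?{urlencode(params, safe=':,')}"
```

For example, `build_metrics_request("nodes", "10.0.0.1", ["kc::taskCount"], report_nan=True)` yields `nodes/10.0.0.1?metrics=kc::taskCount&reportNaN=true`.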

Metrics

Kafka Connect specific metrics in the monitoring API begin with the kc:: prefix, e.g. kc::connectorCount.

The currently available metrics are:

  • Worker Metrics
    • kc::epoch: Monotonically increasing number that indicates the current state of assigned tasks. Will increase by one for each completed rebalance
    • kc::completedRebalancesTotal: Number of rebalances that have completed since Kafka Connect has started (per node)
    • kc::connectorCount: Number of connectors currently assigned to each worker node
    • kc::connectorStartupAttemptsTotal: Number of times a connector has been instructed to start on each worker node
    • kc::connectorStartupFailurePercentage: Percentage of connector start-up attempts that have failed to complete
    • kc::connectorStartupFailureTotal: Number of times a connector has been instructed to start and failed to do so
    • kc::connectorStartupSuccessPercentage: Percentage of connector start-up attempts that have successfully completed
    • kc::connectorStartupSuccessTotal: Number of times a connector has been instructed to start and has succeeded in doing so
    • kc::latencyMedianMs: The time taken from a record being produced on the connected Kafka Cluster to it being read on the Kafka Connect cluster. Measured using synthetic messages. Only available if attached to an Instaclustr managed Kafka cluster.
    • kc::latencyRecordsProcessed: The number of messages processed to produce the latencyMedianMs measure. Only available if attached to an Instaclustr managed Kafka cluster.
    • kc::leaderName: Identity of the current leader worker node. Typically this is the IP address of the leader.
    • kc::rebalanceAvgTimeMs: The average time each rebalance has taken to complete (per node, in milliseconds)
    • kc::rebalanceMaxTimeMs: The maximum time each rebalance has taken to complete (per node, in milliseconds)
    • kc::rebalancing: Whether or not the worker is currently rebalancing (per node)
    • kc::restApiAvailable: Whether or not the Kafka Connect REST API is currently available
    • kc::taskCount: Number of tasks currently assigned to each worker node
    • kc::taskStartupAttemptsTotal: Number of times a task has been instructed to start on each worker node
    • kc::taskStartupFailurePercentage: Percentage of task start-up attempts that have failed to complete
    • kc::taskStartupFailureTotal: Number of times a task has been instructed to start and failed to do so
    • kc::taskStartupSuccessPercentage: Percentage of task start-up attempts that have successfully completed
    • kc::taskStartupSuccessTotal: Number of times a task has been instructed to start and has succeeded in doing so
    • kc::timeSinceLastRebalanceMs: Time since the last successful rebalance that each node participated in (per node, in milliseconds)
  • Task General Metrics (reported only by nodes that own tasks)
    • kct::<connector-name>::<task-id>::batchSizeAvg: The average size of the batches processed by the connector.
    • kct::<connector-name>::<task-id>::offsetCommitAvgTimeMs: The average time in milliseconds taken by this task to commit offsets.
    • kct::<connector-name>::<task-id>::offsetCommitFailurePercentage: The average percentage of this task’s offset commit attempts that failed.
    • kct::<connector-name>::<task-id>::pauseRatio: The fraction of time this task has spent in the pause state.
    • kct::<connector-name>::<task-id>::status: The status of the connector task. One of ‘unassigned’, ‘running’, ‘paused’ or ‘failed’.
  • Task Error Metrics (reported only by nodes that own tasks)
    • kct::<connector-name>::<task-id>::deadletterqueueProduceFailures: The number of failed writes to the dead letter queue.
    • kct::<connector-name>::<task-id>::deadletterqueueProduceRequests: The number of attempted writes to the dead letter queue.
    • kct::<connector-name>::<task-id>::lastErrorTimestamp: The epoch timestamp when this task last encountered an error.
    • kct::<connector-name>::<task-id>::totalErrorsLogged: The number of errors that were logged.
    • kct::<connector-name>::<task-id>::totalRecordErrors: The number of record processing errors in this task.
    • kct::<connector-name>::<task-id>::totalRecordFailures: The number of record processing failures in this task.
    • kct::<connector-name>::<task-id>::totalRecordsSkipped: The number of records skipped due to errors.
    • kct::<connector-name>::<task-id>::totalRetries: The number of operations retried.
  • Sink Task Metrics (reported only by nodes that own tasks)
    • kct::<connector-name>::<task-id>::offsetCommitCompletionRate: The average per-second number of offset commit completions that were completed successfully.
    • kct::<connector-name>::<task-id>::offsetCommitCompletionTotal: The total number of offset commit completions that were completed successfully.
    • kct::<connector-name>::<task-id>::offsetCommitSeqNo: The current sequence number for offset commits.
    • kct::<connector-name>::<task-id>::offsetCommitSkipRate: The average per-second number of offset commit completions that were received too late and skipped/ignored.
    • kct::<connector-name>::<task-id>::offsetCommitSkipTotal: The total number of offset commit completions that were received too late and skipped/ignored.
    • kct::<connector-name>::<task-id>::partitionCount: The number of topic partitions assigned to this task belonging to the named sink connector in this worker.
    • kct::<connector-name>::<task-id>::putBatchAvgTimeMs: The average time taken by this task to put a batch of sink records.
    • kct::<connector-name>::<task-id>::sinkRecordActiveCount: The number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task.
    • kct::<connector-name>::<task-id>::sinkRecordActiveCountAvg: The average number of records that have been read from Kafka but not yet completely committed/flushed/acknowledged by the sink task.
    • kct::<connector-name>::<task-id>::sinkRecordReadRate: The average per-second number of records read from Kafka for this task belonging to the named sink connector in this worker. This is before transformations are applied.
    • kct::<connector-name>::<task-id>::sinkRecordReadTotal: The total number of records read from Kafka by this task belonging to the named sink connector in this worker, since the task was last restarted.
    • kct::<connector-name>::<task-id>::sinkRecordSendRate: The average per-second number of records output from the transformations and sent/put to this task belonging to the named sink connector in this worker. This is after transformations are applied and excludes any records filtered out by the transformations.
    • kct::<connector-name>::<task-id>::sinkRecordSendTotal: The total number of records output from the transformations and sent/put to this task belonging to the named sink connector in this worker, since the task was last restarted.
  • Source Task Metrics (reported only by nodes that own tasks)
    • kct::<connector-name>::<task-id>::pollBatchAvgTimeMs: The average time in milliseconds taken by this task to poll for a batch of source records.
    • kct::<connector-name>::<task-id>::sourceRecordActiveCount: The number of records that have been produced by this task but not yet completely written to Kafka.
    • kct::<connector-name>::<task-id>::sourceRecordActiveCountAvg: The average number of records that have been produced by this task but not yet completely written to Kafka.
    • kct::<connector-name>::<task-id>::sourceRecordPollRate: The average per-second number of records produced/polled (before transformation) by this task belonging to the named source connector in this worker.
    • kct::<connector-name>::<task-id>::sourceRecordPollTotal: The total number of records produced/polled (before transformation) by this task belonging to the named source connector in this worker.
    • kct::<connector-name>::<task-id>::sourceRecordWriteRate: The average per-second number of records output from the transformations and written to Kafka for this task belonging to the named source connector in this worker. This is after transformations are applied and excludes any records filtered out by the transformations.
    • kct::<connector-name>::<task-id>::sourceRecordWriteTotal: The number of records output from the transformations and written to Kafka for this task belonging to the named source connector in this worker, since the task was last restarted.
  • Connect Metrics (reported only by nodes that own the connectors)
    • kcc::<connectorName>::connectorUnassignedTaskCount: The number of unassigned tasks belonging to the connector. This is only available for Kafka Connect 2.5.1+.
    • kcc::<connectorName>::connectorTotalTaskCount: The total number of tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+.
    • kcc::<connectorName>::connectorRunningTaskCount: The number of running tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+.
    • kcc::<connectorName>::connectorDestroyedTaskCount: The number of destroyed tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+.
    • kcc::<connectorName>::connectorFailedTaskCount: The number of failed tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+.
    • kcc::<connectorName>::connectorPausedTaskCount: The number of paused tasks assigned to the connector. This is only available for Kafka Connect 2.5.1+.
  • Mirroring Source Connector Metrics (reported only by nodes that own the connectors; aggregated at topic level per worker)
    • kc::mm::source::<target>::<topic-name-in-target>::recordCount: Number of records replicated by the mirroring source connector.
    • kc::mm::source::<target>::<topic-name-in-target>::byteCount: Byte count replicated by the mirroring source connector.
    • kc::mm::source::<target>::<topic-name-in-target>::recordRate: Record replication rate of the mirroring source connector.
    • kc::mm::source::<target>::<topic-name-in-target>::byteRate: Byte replication rate of the mirroring source connector.
    • kc::mm::source::<target>::<topic-name-in-target>::recordAgeMs: Age of each record at the time it is consumed by the mirroring source connector.
    • kc::mm::source::<target>::<topic-name-in-target>::replicationLatencyMs: Timespan between each record’s timestamp and downstream acknowledgment.
  • Mirroring Checkpoint Connector Metrics (reported only by nodes that own the connectors; aggregated at topic level per worker)
    • kc::mm::checkpoint::<source>::<target>::<group>::<topic-name-in-target>::checkpointLatencyMs: Timespan between consumer group commit and downstream checkpoint acknowledgment.
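Because the task, connector and mirroring metric names embed connector names, task IDs and topic names, it can help to generate them rather than type them by hand. The helpers below are a sketch that only mirrors the naming patterns listed above; the function names and the sample connector name are hypothetical.

```python
def worker_metric(name: str) -> str:
    """Worker metric, e.g. kc::connectorCount."""
    return f"kc::{name}"


def task_metric(connector: str, task_id: int, name: str) -> str:
    """Task metric: kct::<connector-name>::<task-id>::<name>."""
    return f"kct::{connector}::{task_id}::{name}"


def connector_metric(connector: str, name: str) -> str:
    """Connector metric: kcc::<connectorName>::<name>."""
    return f"kcc::{connector}::{name}"


def mirror_source_metric(target: str, topic: str, name: str) -> str:
    """Mirroring source metric: kc::mm::source::<target>::<topic>::<name>."""
    return f"kc::mm::source::{target}::{topic}::{name}"


# Hypothetical connector "my-sink" with two tasks (IDs 0 and 1).
status_metrics = [task_metric("my-sink", task_id, "status") for task_id in range(2)]
```

The resulting strings can be joined with commas to form the metrics query parameter, keeping within the 20-metric limit per request.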

Example: Endpoint to return the time since the last rebalance, and whether each node is currently rebalancing, for every node in the cluster with a UUID of 7b58eae9-2b72-420a-a544-32a404b70fd7.
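A sketch of how the endpoint for this example might be assembled. The base URL is passed in as a parameter because it is not stated here, and the `clusters/<UUID>?metrics=...` layout is an assumption based on the request components described earlier. The function name and the placeholder host are hypothetical.

```python
CLUSTER_ID = "7b58eae9-2b72-420a-a544-32a404b70fd7"
# Metrics named in the example: time since last rebalance, and rebalancing state.
METRICS = ["kc::timeSinceLastRebalanceMs", "kc::rebalancing"]


def example_endpoint(base_url: str) -> str:
    """Build the example URL under an assumed '<base>/clusters/<uuid>' layout."""
    query = "metrics=" + ",".join(METRICS)
    return f"{base_url.rstrip('/')}/clusters/{CLUSTER_ID}?{query}"


# Placeholder host for illustration; substitute the real monitoring API base URL.
url = example_endpoint("https://example.invalid/")
```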

Response:
