
Metrics Reference

Korvet exposes metrics in Prometheus format via Spring Boot Actuator.
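As a rough illustration of what a scrape returns, the sketch below fetches the Prometheus endpoint and parses the text exposition format into tuples. It assumes the Spring Boot default path `/actuator/prometheus` and a Korvet instance at `http://localhost:8080`; both are assumptions that depend on your deployment. The parser is deliberately simplified and is not a full implementation of the format.

```python
import re
from urllib.request import urlopen

# Matches sample lines such as: metric_name{label="value",...} 12.0
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE comments
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def scrape(base_url="http://localhost:8080"):
    """Fetch and parse the metrics endpoint (base_url and path are assumptions)."""
    with urlopen(base_url + "/actuator/prometheus", timeout=5) as resp:
        return parse_samples(resp.read().decode("utf-8"))
```

For example, `[s for s in scrape() if s[0].startswith("korvet_")]` would keep only the Korvet custom metrics from a scrape.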

Korvet Custom Metrics

Produce Metrics

| Metric | Description | Tags |
|---|---|---|
| korvet.produce.messages | Number of messages produced | topic, partition |
| korvet.produce.latency | Produce request latency (histogram) | topic, partition |

Fetch Metrics

| Metric | Description | Tags |
|---|---|---|
| korvet.fetch.requests | Number of fetch requests | topic, partition |
| korvet.fetch.latency | Fetch request latency (histogram) | topic, partition |
| korvet.fetch.messages | Number of messages fetched | topic, partition, tier |

Error Metrics

| Metric | Description | Tags |
|---|---|---|
| korvet.errors | Number of errors by operation and type | operation, error_type |

JVM Metrics

Standard JVM metrics from Micrometer:

| Metric | Description |
|---|---|
| jvm.memory.used | JVM memory used |
| jvm.memory.max | JVM maximum memory |
| jvm.gc.pause | Garbage collection pause time |
| jvm.threads.live | Live threads |
| process.cpu.usage | Process CPU usage |

System Metrics

| Metric | Description |
|---|---|
| system.cpu.usage | System CPU usage |
| system.load.average.1m | System load average (1 minute) |
| disk.free | Free disk space |
| disk.total | Total disk space |

Querying Metrics

Prometheus Queries

Korvet Metrics

Total messages produced per topic (cumulative counter value):

sum by (topic) (korvet_produce_messages_total)

Produce latency (p99) per topic:

histogram_quantile(0.99, sum by (topic, le) (rate(korvet_produce_latency_seconds_bucket[5m])))

Fetch latency (p95) per topic:

histogram_quantile(0.95, sum by (topic, le) (rate(korvet_fetch_latency_seconds_bucket[5m])))
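To make the latency queries above easier to interpret, here is a simplified sketch of how a quantile is estimated from cumulative histogram buckets: find the bucket that contains the target rank, then interpolate linearly inside it. This is an illustration of the general technique, not Prometheus's exact implementation, and the bucket boundaries and counts are made-up example data.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from (upper_bound, cumulative_count) pairs.

    The list should include the (math.inf, total) bucket, mirroring le="+Inf".
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    if not buckets:
        return math.nan
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # lowest bucket is assumed to start at 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                return prev_bound  # rank falls in +Inf bucket: use highest finite bound
            if count == prev_count:
                return bound  # empty bucket, nothing to interpolate
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical latency buckets in seconds: (le, cumulative count)
BUCKETS = [(0.05, 60), (0.1, 90), (0.5, 99), (1.0, 100), (math.inf, 100)]
p95 = histogram_quantile(0.95, BUCKETS)  # falls in the (0.1, 0.5] bucket
```

With this data the p95 estimate is about 0.32 seconds. Because the result is interpolated, its accuracy depends on how finely the bucket boundaries cover the latency range of interest.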

Total messages fetched by storage tier (cumulative counter value):

sum by (tier) (korvet_fetch_messages_total)

Error rate by operation:

sum by (operation) (rate(korvet_errors_total[5m]))
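The queries above use Prometheus-style names, while the tables list dotted Micrometer names. The sketch below approximates the usual mapping: dots become underscores, counters gain a `_total` suffix, and timers are reported in seconds with `_count`, `_sum`, `_max`, and (when histograms are enabled) `_bucket` series. It assumes the Korvet latency metrics are Micrometer timers, which is consistent with the `_seconds_bucket` names in the queries but is not stated explicitly. The function `prometheus_names` is a hypothetical helper, and the real naming convention handles more cases than shown.

```python
def prometheus_names(meter_name, meter_type, base_unit=None):
    """Approximate the Prometheus series names for a Micrometer meter.

    meter_type is one of "counter", "timer", or "gauge" (simplified model).
    """
    base = meter_name.replace(".", "_")
    if meter_type == "timer":
        base += "_seconds"  # timers are exported in seconds
        return [base + suffix for suffix in ("_count", "_sum", "_max", "_bucket")]
    if base_unit and not base.endswith("_" + base_unit):
        base += "_" + base_unit  # e.g. bytes for memory gauges
    if meter_type == "counter":
        return [base + "_total"]
    if meter_type == "gauge":
        return [base]
    raise ValueError("unsupported meter type: " + meter_type)


# Names taken from the tables in this reference
produce = prometheus_names("korvet.produce.messages", "counter")
latency = prometheus_names("korvet.fetch.latency", "timer")
memory = prometheus_names("jvm.memory.used", "gauge", base_unit="bytes")
```

Here `produce` contains `korvet_produce_messages_total`, `latency` includes `korvet_fetch_latency_seconds_bucket`, and `memory` contains `jvm_memory_used_bytes`, matching the names used in the queries.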

JVM Metrics

JVM heap memory usage (one series per memory pool):

jvm_memory_used_bytes{area="heap"}

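GC pause time (p99). This requires percentile histograms to be enabled for jvm.gc.pause, since Micrometer does not publish histogram buckets for timers by default:

histogram_quantile(0.99, sum by (le) (rate(jvm_gc_pause_seconds_bucket[5m])))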

Process CPU usage (a fraction between 0 and 1):

process_cpu_usage
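Any of the queries in this section can also be run programmatically through the Prometheus HTTP API. The sketch below issues an instant query against `/api/v1/query` and flattens the returned vector. It assumes a Prometheus server at `http://localhost:9090` that already scrapes Korvet; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(promql, base_url="http://localhost:9090"):
    """Build an instant-query URL (base_url is an assumption)."""
    return base_url + "/api/v1/query?" + urlencode({"query": promql})


def parse_vector(payload):
    """Convert an instant-vector API response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError("query failed: " + str(payload.get("error")))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected a vector result, got " + data["resultType"])
    # Each item's value is [timestamp, "value-as-string"]
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def instant_query(promql, base_url="http://localhost:9090"):
    """Run a PromQL instant query and return (labels, value) pairs."""
    with urlopen(build_query_url(promql, base_url), timeout=5) as resp:
        return parse_vector(json.load(resp))
```

For example, `instant_query("sum by (operation) (rate(korvet_errors_total[5m]))")` would return one `(labels, value)` pair per operation, assuming the server is reachable.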