Metrics Reference

Korvet exposes metrics in Prometheus format via Spring Boot Actuator.
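
As a quick way to inspect these metrics without a Prometheus server, the sketch below parses the Prometheus text exposition format into (name, labels, value) tuples. It assumes the Actuator Prometheus endpoint is exposed at Spring Boot's conventional path /actuator/prometheus; the host and port (localhost:8080) are placeholders, and the parser only handles simple label values (no escaped quotes).

```python
import re
import urllib.request

# Matches simple sample lines such as: name{label="v",other="w"} 12.5
# Label values containing escaped quotes are not handled by this sketch.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def scrape(url="http://localhost:8080/actuator/prometheus"):
    """Fetch and parse the endpoint. The default URL is an assumption."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_samples(response.read().decode("utf-8"))
```

For example, `[s for s in scrape() if s[0].startswith("korvet_")]` would keep only the Korvet custom metrics.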

Korvet Custom Metrics

Produce Metrics

Metric                   Description                                    Tags
-----------------------  ---------------------------------------------  ----------------
korvet.produce.messages  Number of messages produced                    topic, partition
korvet.produce.bytes     Number of bytes produced (ingress throughput)  topic, partition
korvet.produce.latency   Produce request latency (histogram)            topic, partition

Fetch Metrics

Metric                 Description                                  Tags
---------------------  -------------------------------------------  ----------------
korvet.fetch.requests  Number of fetch requests                     topic, partition
korvet.fetch.latency   Fetch request latency (histogram)            topic, partition
korvet.fetch.messages  Number of messages fetched                   topic, partition
korvet.fetch.bytes     Number of bytes fetched (egress throughput)  topic, partition

Error Metrics

Metric         Description                             Tags
-------------  --------------------------------------  ---------------------
korvet.errors  Number of errors by operation and type  operation, error_type

Backpressure Metrics

Metric                        Description                                                          Tags
----------------------------  -------------------------------------------------------------------  ----
korvet.backpressure.applied   Number of times backpressure was applied to producer connections     none
korvet.backpressure.released  Number of times backpressure was released from producer connections  none

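Backpressure metrics indicate flow control activity. Both metrics are counters, so watch their rate of increase rather than the absolute value: a sustained high rate of korvet.backpressure.applied suggests producers are sending data faster than the server can process. Consider tuning max-pending-bytes or scaling resources.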

Storage Metrics

Storage metrics are only available when cold storage is enabled (korvet.storage.enabled=true).

Metric                                    Description                                          Tags
----------------------------------------  ---------------------------------------------------  ------
korvet.storage.archive.batch.time         Time to process a batch (read + write + commit)      stream
korvet.storage.archive.messages.read      Number of messages read from Redis stream            stream
korvet.storage.archive.messages.archived  Number of messages archived to Delta Lake            stream
korvet.storage.archive.batch.size         Number of messages per batch (distribution summary)  stream
korvet.storage.read.redis.time            Time to read from Redis stream (XREADGROUP)          none
korvet.storage.parquet.write.time         Time to write Parquet file                           none
korvet.storage.parquet.bytes.written      Total bytes written to Parquet files                 none
korvet.storage.parquet.file.size          Parquet file size in bytes (distribution summary)    none
korvet.storage.delta.commit.time          Time to commit to Delta Lake transaction log         none
korvet.storage.delta.commits              Number of Delta Lake commits                         none

Lettuce Redis Client Metrics

Korvet can optionally enable Lettuce command latency metrics to track Redis operation performance.

These metrics are disabled by default. Enable them by setting korvet.redis.metrics.enabled=true in your configuration.

Metric                         Description                                Tags
-----------------------------  -----------------------------------------  ----------------------
lettuce.command.firstresponse  Time to first response from Redis (timer)  command, local, remote
lettuce.command.completion     Time to complete Redis command (timer)     command, local, remote

Tags:

  • command: Redis command name (e.g., GET, SET, XADD, XREAD)

  • local: Local socket address (when local-distinction is enabled)

  • remote: Remote Redis server address

Configuration:

korvet:
  redis:
    metrics:
      enabled: true  # Enable Lettuce command metrics (default: false)
      histogram: false  # Enable histogram buckets for percentiles (default: false)
      local-distinction: false  # Track per connection vs per host (default: false)
      max-latency: 5m  # Maximum expected latency (default: 5 minutes)
      min-latency: 1ms  # Minimum expected latency (default: 1 millisecond)

Configuration Properties:

  • enabled (boolean, default: false): Enable Lettuce command latency metrics

  • histogram (boolean, default: false): Enable histogram buckets for aggregable percentile approximations

  • local-distinction (boolean, default: false): Track metrics per connection instead of per host/port

  • max-latency (duration, default: 5m): Maximum expected latency for histogram buckets (only applies when histogram is enabled)

  • min-latency (duration, default: 1ms): Minimum expected latency for histogram buckets (only applies when histogram is enabled)

JVM Metrics

Standard JVM metrics from Micrometer:

Metric             Description
-----------------  -----------------------------
jvm.memory.used    JVM memory used
jvm.memory.max     JVM maximum memory
jvm.gc.pause       Garbage collection pause time
jvm.threads.live   Live threads
process.cpu.usage  Process CPU usage

System Metrics

Metric                  Description
----------------------  ------------------------------
system.cpu.usage        System CPU usage
system.load.average.1m  System load average (1 minute)
disk.free               Free disk space
disk.total              Total disk space

Querying Metrics

Prometheus Queries
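
The metric listings in this reference use Micrometer's dotted names, while the queries in this section use the names as they appear in Prometheus. The sketch below approximates that translation for the common cases (counters, timers, gauges). It is a simplification of Micrometer's Prometheus naming convention, not a full reimplementation, and the meter types passed in are assumptions inferred from the query names on this page.

```python
def prometheus_names(meter_name, meter_type, base_unit=None):
    """Approximate the Prometheus series names for a Micrometer meter.

    meter_type is one of "counter", "timer" or "gauge"; other meter
    types are deliberately not covered by this sketch.
    """
    base = meter_name.replace(".", "_")
    if meter_type == "counter":
        # Counters are exported with a _total suffix.
        return [base if base.endswith("_total") else base + "_total"]
    if meter_type == "timer":
        # Timers are exported in seconds. The _bucket series only exist
        # when histogram buckets are enabled for the timer.
        base += "_seconds"
        return [base + "_count", base + "_sum", base + "_max", base + "_bucket"]
    if meter_type == "gauge":
        # Gauges carry their base unit as a suffix when one is set.
        if base_unit and not base.endswith("_" + base_unit):
            base += "_" + base_unit
        return [base]
    raise ValueError(f"unsupported meter type: {meter_type}")
```

For instance, `prometheus_names("korvet.produce.messages", "counter")` yields `korvet_produce_messages_total`, the name used in the first query below.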

Korvet Metrics

Messages produced per topic:

sum by (topic) (korvet_produce_messages_total)

Produce latency (p99) per topic:

histogram_quantile(0.99, sum by (topic, le) (rate(korvet_produce_latency_seconds_bucket[5m])))

Fetch latency (p95) per topic:

histogram_quantile(0.95, sum by (topic, le) (rate(korvet_fetch_latency_seconds_bucket[5m])))
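
To make the percentile queries above less opaque, here is a rough sketch of how a quantile can be estimated from cumulative histogram buckets by linear interpolation inside the matching bucket. It follows the general approach of PromQL's histogram_quantile for classic histograms but omits several of its edge cases; the bucket boundaries and counts in the usage note are made-up illustration values, not Korvet output.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs and must
    include the +Inf bucket. Only 0 < q <= 1 is supported in this sketch.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if not buckets or not math.isinf(buckets[-1][0]):
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for index, (upper, count) in enumerate(buckets):
        if count >= rank:
            break

    if math.isinf(upper):
        # The quantile lies beyond the highest finite bound, so the best
        # available answer is that bound itself.
        return buckets[-2][0] if len(buckets) > 1 else math.nan
    if index == 0:
        if upper <= 0:
            return upper
        lower, previous = 0.0, 0.0  # the first bucket is assumed to start at 0
    else:
        lower, previous = buckets[index - 1]
    # Assume observations are spread evenly within the bucket.
    return lower + (upper - lower) * (rank - previous) / (count - previous)
```

With buckets `[(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]`, the estimated median is 0.1 seconds because half of the observations fall in the first bucket. The estimate can never be more precise than the bucket boundaries allow.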

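Messages fetched by storage tier (requires a tier label on korvet.fetch.messages, which is not among the tags listed under Fetch Metrics):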

sum by (tier) (korvet_fetch_messages_total)

Error rate by operation:

sum by (operation) (rate(korvet_errors_total[5m]))

Backpressure events per minute:

rate(korvet_backpressure_applied_total[1m]) * 60
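
The per-minute figure above is a per-second rate scaled by 60. As a simplified illustration of what happens underneath, this sketch computes the increase of a counter across a series of samples, treating any drop in value as a counter reset (for example after a restart). Unlike PromQL's rate, it does no extrapolation to the window edges, and the sample values in the usage note are hypothetical.

```python
def increase(samples):
    """Total counter increase over (timestamp_seconds, value) samples.

    Samples must be ordered by time. A value lower than its predecessor
    is treated as a counter reset, so counting restarts from zero.
    """
    total = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        total += current - previous if current >= previous else current
    return total


def events_per_minute(samples):
    """Average events per minute over the sampled window, or None."""
    if len(samples) < 2:
        return None
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    return increase(samples) * 60 / elapsed
```

Three samples `[(0, 10), (30, 13), (60, 16)]` give 6 backpressure events over 60 seconds, that is, 6.0 events per minute.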

Ingress throughput (bytes/sec) per topic:

sum by (topic) (rate(korvet_produce_bytes_total[5m]))

Egress throughput (bytes/sec) per topic:

sum by (topic) (rate(korvet_fetch_bytes_total[5m]))

Total throughput in MiB/sec (bytes divided by 1024 twice):

(sum(rate(korvet_produce_bytes_total[5m])) + sum(rate(korvet_fetch_bytes_total[5m]))) / 1024 / 1024

Lettuce Redis Metrics

These queries only work when korvet.redis.metrics.enabled=true.

Redis command latency (p99) by command:

histogram_quantile(0.99, sum by (command, le) (rate(lettuce_command_completion_seconds_bucket[5m])))

Redis command rate by command:

sum by (command) (rate(lettuce_command_completion_seconds_count[5m]))

Average Redis command latency by command:

sum by (command) (rate(lettuce_command_completion_seconds_sum[5m])) / sum by (command) (rate(lettuce_command_completion_seconds_count[5m]))
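
The average-latency query divides the growth of the timer's _sum series by the growth of its _count series over the same window. The same arithmetic on two scrapes of a single series can be sketched as follows; the function name and the sample numbers are illustrative only.

```python
def average_latency(sum_start, count_start, sum_end, count_end):
    """Mean latency in seconds for commands completed between two scrapes.

    sum_* are values of a ..._seconds_sum series and count_* are values of
    the matching ..._seconds_count series. Returns None when no commands
    completed in the window (or the counters were reset).
    """
    completed = count_end - count_start
    if completed <= 0:
        return None
    return (sum_end - sum_start) / completed
```

If the sum grows from 1.5 to 2.5 seconds while the count grows from 100 to 300, then 200 commands took 1.0 second in total, an average of 5 ms per command.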

Top 5 slowest Redis commands (p99):

topk(5, histogram_quantile(0.99, sum by (command, le) (rate(lettuce_command_completion_seconds_bucket[5m]))))

JVM Metrics

JVM heap memory usage:

jvm_memory_used_bytes{area="heap"}

GC pause time (p99), assuming histogram buckets are published for jvm.gc.pause:

histogram_quantile(0.99, sum by (le) (rate(jvm_gc_pause_seconds_bucket[5m])))

CPU usage:

process_cpu_usage