For the latest stable version, please use Korvet 0.12.5!
Monitoring
Korvet exposes health checks and metrics through Spring Boot Actuator and Micrometer.
Health Checks
Korvet exposes health check endpoints:
# Overall health
curl http://localhost:8080/actuator/health
# Liveness probe (for Kubernetes)
curl http://localhost:8080/actuator/health/liveness
# Readiness probe (for Kubernetes)
curl http://localhost:8080/actuator/health/readiness
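Deployment scripts often need to block until the readiness probe passes. The following is a minimal sketch, not part of Korvet itself: the function names health_status and wait_until_ready, the KORVET_URL variable, and the retry defaults are illustrative assumptions. It also assumes the usual Spring Boot Actuator response shape, in which the top-level "status" field appears first in the JSON body.

```shell
# Hypothetical helpers; names and defaults are assumptions, not Korvet tooling.

# Print the first "status" value found in an Actuator health JSON body on stdin.
health_status() {
  grep -o '"status"[[:space:]]*:[[:space:]]*"[A-Z_]*"' \
    | head -n 1 \
    | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Poll the readiness endpoint until it reports UP or the attempts run out.
# $1 = base URL, $2 = maximum attempts, $3 = delay between attempts in seconds.
wait_until_ready() {
  local base_url="${1:-${KORVET_URL:-http://localhost:8080}}"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1
  local status
  while [ "$i" -le "$attempts" ]; do
    # -s silences progress output; a failed connection yields an empty status.
    status="$(curl -s "${base_url}/actuator/health/readiness" | health_status)"
    if [ "$status" = "UP" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_until_ready http://localhost:8080 60 1 && echo "Korvet is ready"` would wait up to roughly a minute before giving up with a non-zero exit code.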
Metrics
Metrics are exposed in Prometheus format:
curl http://localhost:8080/actuator/prometheus
Available Metrics
Korvet Custom Metrics
- korvet.produce.messages: Number of messages produced (tags: topic, partition)
- korvet.produce.latency: Produce request latency histogram (tags: topic, partition)
- korvet.fetch.requests: Number of fetch requests (tags: topic, partition)
- korvet.fetch.latency: Fetch request latency histogram (tags: topic, partition)
- korvet.fetch.messages: Number of messages fetched (tags: topic, partition, tier)
- korvet.errors: Number of errors (tags: operation, error_type)
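When reading these meters from the Prometheus endpoint by hand, keep in mind that Micrometer's Prometheus registry converts dots to underscores and appends suffixes such as _total to counters, so korvet.produce.messages would typically appear as korvet_produce_messages_total. The exact exported names depend on how Korvet registers each meter, so treat the names below as assumptions and confirm them against the /actuator/prometheus output. The sketch defines a hypothetical sum_metric helper that totals one metric across all of its tag combinations.

```shell
# Hypothetical helper: sum every sample of one metric across its label sets.
# Reads Prometheus text exposition format on stdin; assumes samples carry no
# trailing timestamp, so the value is the last whitespace-separated field.
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }                 # skip HELP and TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)    # drop the label set, keep the metric name
      if (metric == name) total += $NF
    }
    END { printf "%.0f\n", total }
  '
}
```

A possible invocation, assuming the counter name above is correct for your build: `curl -s http://localhost:8080/actuator/prometheus | sum_metric korvet_produce_messages_total`.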
JVM and System Metrics
Standard JVM and system metrics from Micrometer:
- jvm.memory.used: JVM memory used
- jvm.memory.max: JVM maximum memory
- jvm.gc.pause: Garbage collection pause time
- jvm.threads.live: Live threads
- process.cpu.usage: Process CPU usage
- system.cpu.usage: System CPU usage
- system.load.average.1m: 1-minute system load average
Prometheus Configuration
Add Korvet to your Prometheus scrape config:
scrape_configs:
  - job_name: 'korvet'
    static_configs:
      - targets: ['korvet:8080']
    metrics_path: '/actuator/prometheus'
Grafana Dashboards
A complete monitoring stack with Prometheus and Grafana is available in the korvet-dist samples. See the Grafana sample README for quick start instructions.
The pre-built dashboard visualizes:
- Kafka Operations: Produce/fetch rates, latency percentiles (p50, p95, p99)
- Storage Tiers: Messages fetched by tier (hot/cold/warm)
- Errors: Error rates by operation and type
- Resources: CPU usage, memory usage
- JVM: Heap memory, GC pauses, thread count
- System: Load average, disk space
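Latency percentiles like those listed above are commonly derived with PromQL's histogram_quantile function over histogram buckets. The sketch below shows how a p95 query might be assembled and sent to the Prometheus HTTP API. The bucket metric name korvet_produce_latency_seconds_bucket, the 5-minute rate window, the PROMETHEUS_URL variable, and both helper names are assumptions, and the bundled dashboard may use different queries.

```shell
# Hypothetical helper: build a PromQL percentile query for a histogram metric.
# $1 = quantile between 0 and 1, $2 = bucket metric name, $3 = rate window.
latency_quantile_query() {
  local quantile="$1"
  local bucket_metric="$2"
  local window="${3:-5m}"
  printf 'histogram_quantile(%s, sum by (le) (rate(%s[%s])))\n' \
    "$quantile" "$bucket_metric" "$window"
}

# Run an instant query against the Prometheus HTTP API and print the raw JSON.
# PROMETHEUS_URL is an assumed variable; adjust it to your deployment.
query_prometheus() {
  curl -sG "${PROMETHEUS_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}
```

For instance, `query_prometheus "$(latency_quantile_query 0.95 korvet_produce_latency_seconds_bucket)"` would request the p95 produce latency, provided the metric is exported under that name.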
Alerting
Logging
See Logging for log-based monitoring.