For the latest stable version, please use Korvet 0.12.5!
Storage
Korvet uses Redis Streams as its primary storage layer, with optional tiered storage to Delta Lake for long-term archival.
Storage Architecture
Korvet supports two storage configurations:
Redis-Only Storage (Default)
The default configuration uses Redis Streams exclusively:
- Primary storage: All messages stored in Redis Streams
- Persistence: Redis AOF and RDB for durability
- Consumer groups: Built-in support for coordinated consumption
- Performance: Sub-millisecond read/write latency
- Retention: Configurable time and size-based retention (applied at write time)
This is the recommended configuration for most use cases.
Tiered Storage (Experimental)
For long-term data retention and cost optimization, Korvet can be configured with tiered storage:
- Hot tier: Recent messages in Redis Streams for low-latency access
- Cold tier: Archived messages in Delta Lake (Parquet format) on S3 or local filesystem
- Automatic archival: Separate Flink job archives old messages from Redis to Delta Lake
- Transparent reads: Standalone consumers automatically read from both tiers
- Consumer groups: Always read from hot tier only (Redis Streams)
Tiered storage is not production-ready and is currently experimental. It requires:
Most users should use Redis-only storage.
How It Works
Redis-Only Storage
- Produce: Messages are written to Redis Streams using XADD
- Retention: Retention policies are applied at write time using MAXLEN and MINID arguments
- Consume: Consumers read messages using XREAD (standalone) or XREADGROUP (consumer groups)
- Persistence: Redis handles durability through AOF logging and RDB snapshots
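As a rough illustration of retention applied at write time, the sketch below builds the token list for an XADD call that trims the stream in the same command. The helper names (`build_xadd`, `min_id_for_retention`) are hypothetical and not part of Korvet. The use of approximate trimming (`~`) is an assumption, as is issuing only one trimming strategy per call; Redis itself accepts either MAXLEN or MINID on a single XADD, and how Korvet combines size and time limits is not described here.

```python
from typing import Dict, List, Optional


def min_id_for_retention(now_ms: int, retention_ms: int) -> str:
    """Oldest stream ID to keep. Stream IDs start with a millisecond timestamp."""
    return f"{max(now_ms - retention_ms, 0)}-0"


def build_xadd(key: str, fields: Dict[str, str],
               max_len: Optional[int] = None,
               min_id: Optional[str] = None) -> List[str]:
    """Build XADD command tokens that trim the stream as part of the write."""
    if max_len is not None and min_id is not None:
        # Redis accepts a single trimming strategy per XADD call.
        raise ValueError("choose either max_len or min_id, not both")
    cmd = ["XADD", key]
    if max_len is not None:
        cmd += ["MAXLEN", "~", str(max_len)]  # size-based retention (assumed approximate)
    elif min_id is not None:
        cmd += ["MINID", "~", min_id]         # time-based retention (assumed approximate)
    cmd.append("*")                           # let Redis assign the entry ID
    for name, value in fields.items():
        cmd += [name, value]
    return cmd
```

For example, `build_xadd("korvet:stream:orders:0", {"value": "hi"}, max_len=1000)` yields the tokens for a write that also caps the stream at roughly 1000 entries.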
Tiered Storage
- Produce: Messages are written to Redis Streams (hot tier)
- Archive: Flink job periodically reads old messages from Redis and writes to Delta Lake (cold tier)
- Consume:
  - Standalone consumers read from both hot and cold tiers automatically
  - Consumer groups read from hot tier only
- Cleanup: Archived messages can be trimmed from Redis to save memory
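The transparent read path for standalone consumers might look like the sketch below. It is an in-memory model only, not Korvet's implementation: `read_tiered` and the `(offset, payload)` tuple shape are hypothetical, each tier is assumed to be sorted by offset, and the cold tier is assumed to hold the older offsets. Messages that were archived but not yet trimmed from Redis appear in both tiers, so the sketch skips any offset it has already returned.

```python
from typing import Iterable, List, Tuple

Message = Tuple[int, bytes]  # (offset, payload); hypothetical shape for illustration


def read_tiered(cold: Iterable[Message], hot: Iterable[Message],
                start_offset: int, max_count: int) -> List[Message]:
    """Return up to max_count messages from start_offset, cold tier first."""
    result: List[Message] = []
    next_offset = start_offset
    for tier in (cold, hot):
        for offset, payload in tier:
            if len(result) >= max_count:
                return result
            if offset < next_offset:
                # Before the requested offset, or already served from the cold tier.
                continue
            result.append((offset, payload))
            next_offset = offset + 1
    return result
```

A consumer group would skip the cold tier entirely, which in this model corresponds to passing an empty `cold` sequence.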
Stream Key Format
Each Kafka topic partition maps to a single Redis Stream:
{keyspace}:stream:{topic}:{partition}
Examples (using default keyspace korvet):
korvet:stream:orders:0     # Topic "orders", partition 0
korvet:stream:orders:1     # Topic "orders", partition 1
korvet:stream:payments:0   # Topic "payments", partition 0
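The key format above can be expressed as a small pair of helpers. This is a minimal sketch; `stream_key` and `parse_stream_key` are hypothetical names, and the parser assumes topic names contain no colons (as is the case for valid Kafka topic names).

```python
from typing import Tuple


def stream_key(topic: str, partition: int, keyspace: str = "korvet") -> str:
    """Build the Redis Stream key for one topic partition."""
    return f"{keyspace}:stream:{topic}:{partition}"


def parse_stream_key(key: str) -> Tuple[str, str, int]:
    """Split a stream key back into (keyspace, topic, partition)."""
    prefix, partition = key.rsplit(":", 1)
    prefix, topic = prefix.rsplit(":", 1)
    keyspace, marker = prefix.rsplit(":", 1)
    if marker != "stream":
        raise ValueError(f"not a stream key: {key!r}")
    return keyspace, topic, int(partition)
```

Here `stream_key("orders", 0)` gives `korvet:stream:orders:0`, matching the first example above.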
Message Encoding
Kafka records are decomposed into Redis Stream fields:
- Key: Stored in __key field (if present)
- Headers: Stored as __header.{name} fields
- Value: Encoding depends on value type:
  - JSON: Top-level fields are flattened into separate stream fields
  - Raw bytes: Stored as single value field
See Message Format for details.
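To make the decomposition concrete, here is a hedged sketch of how a record could be turned into stream fields. `encode_record` and its `value_is_json` flag are hypothetical and not Korvet's API. The field names `__key`, `__header.{name}`, and `value` come from the list above; how nested or non-string JSON values are stored is an assumption here (they are re-serialized as JSON text), so consult Message Format for the actual rules.

```python
import json
from typing import Dict, Optional


def encode_record(key: Optional[bytes], headers: Dict[str, bytes],
                  value: bytes, value_is_json: bool) -> Dict[str, bytes]:
    """Decompose a Kafka-style record into Redis Stream fields."""
    fields: Dict[str, bytes] = {}
    if key is not None:
        fields["__key"] = key                      # only present when the record has a key
    for name, header_value in headers.items():
        fields[f"__header.{name}"] = header_value  # one field per header
    if value_is_json:
        doc = json.loads(value)
        if not isinstance(doc, dict):
            raise ValueError("expected a JSON object at the top level")
        for name, item in doc.items():
            # Assumption: strings are stored as-is, other values as JSON text.
            text = item if isinstance(item, str) else json.dumps(item)
            fields[name] = text.encode("utf-8")
    else:
        fields["value"] = value                    # raw bytes go into a single field
    return fields
```

The resulting mapping is what would be passed as the field/value pairs of an XADD call.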
Benefits
- Performance: Sub-millisecond latency for hot tier operations
- Simplicity: Redis-only mode requires no additional infrastructure
- Reliability: Redis persistence ensures data durability
- Scalability: Handle millions of messages per second
- Cost optimization: Optional cold tier reduces storage costs for long-term retention
- Flexibility: Choose between simplicity (Redis-only) and cost optimization (tiered)