
Design Decisions

This page documents key design decisions made in the Kafka-to-Redis mapping. It is intended for contributors and operators who need implementation rationale and trade-offs.

Offset Mapping Strategy

Challenge: Kafka uses sequential integer offsets (0, 1, 2…​), while Redis Streams use timestamp-based entry IDs (e.g., 1234567890123-0).

Selected Approach: Stateless Encoding

Kafka offsets are computed directly from Redis Stream entry IDs using bit-packing:

Entry ID format:  {timestamp}-{sequence}
Kafka offset:     (timestamp << sequenceBits) | sequence

Configuration:

  • sequenceBits: Number of bits allocated for the sequence number (default: 10)

  • Maximum sequence per millisecond: 2^sequenceBits - 1 (1023 for 10 bits)

  • If the sequence exceeds the maximum, Redis automatically increments the timestamp
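
Since the conversion is pure arithmetic, it can be shown in a few lines. A minimal sketch in Java, assuming the default sequenceBits of 10; the class and method names are illustrative, not taken from the Korvet codebase:

// Hypothetical codec for the stateless encoding described above.
public final class OffsetCodec {

    private final int sequenceBits;
    private final long sequenceMask;

    public OffsetCodec(int sequenceBits) {
        this.sequenceBits = sequenceBits;
        this.sequenceMask = (1L << sequenceBits) - 1;  // 1023 for the default 10 bits
    }

    // "{timestamp}-{sequence}" -> Kafka offset
    public long toOffset(String entryId) {
        int dash = entryId.indexOf('-');
        long timestamp = Long.parseLong(entryId.substring(0, dash));
        long sequence = Long.parseLong(entryId.substring(dash + 1));
        return (timestamp << sequenceBits) | sequence;
    }

    // Kafka offset -> "{timestamp}-{sequence}"
    public String toEntryId(long offset) {
        return (offset >>> sequenceBits) + "-" + (offset & sequenceMask);
    }
}

For example, entry ID 1234567890123-5 encodes to (1234567890123 << 10) | 5 = 1264197519485957. A millisecond timestamp currently occupies about 41 bits, so with 10 sequence bits the result fits comfortably in a signed 64-bit long.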

Advantages:

  • No extra storage: No hash mappings needed

  • Stateless: Offset ↔ Entry ID conversion is pure computation

  • Monotonic: Offsets increase monotonically, as in Kafka

  • Efficient: O(1) conversion in both directions

Trade-offs:

  • Offsets are large numbers (not 0, 1, 2…​)

  • Requires configuring sequenceBits based on throughput

  • High-throughput partitions (>1000 msg/ms) need more sequence bits

Consumer Group Coordination

Challenge: Kafka has a complex group coordination protocol (join, sync, heartbeat, rebalance).

Approach:

  • Use Redis Streams native consumer groups for message delivery and offset tracking

  • Implement the Kafka group coordinator protocol in the broker (in-memory state)

  • Use XGROUP CREATE to create consumer groups

  • Use XREADGROUP for group-based message consumption

  • Use XACK to acknowledge delivered Redis entries during commit processing

  • Persist committed Kafka offsets in a dedicated committed-offset store

  • Group membership, assignments, and rebalancing are managed by broker state rather than Redis stream metadata; the Redis commands involved are sketched below
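
A minimal sketch of the Redis side of this approach, assuming the Lettuce client; the stream key, group, and consumer names are hypothetical:

import io.lettuce.core.Consumer;
import io.lettuce.core.RedisClient;
import io.lettuce.core.StreamMessage;
import io.lettuce.core.XGroupCreateArgs;
import io.lettuce.core.XReadArgs;
import io.lettuce.core.XReadArgs.StreamOffset;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.List;

public final class GroupReadSketch {
    public static void main(String[] args) {
        RedisCommands<String, String> redis =
                RedisClient.create("redis://localhost").connect().sync();

        String stream = "orders:0";      // hypothetical {topic}:{partition} key
        String group = "example-group";

        // XGROUP CREATE orders:0 example-group 0 MKSTREAM
        redis.xgroupCreate(StreamOffset.from(stream, "0"), group,
                XGroupCreateArgs.Builder.mkstream());

        // XREADGROUP GROUP example-group consumer-1 COUNT 10 STREAMS orders:0 >
        List<StreamMessage<String, String>> messages = redis.xreadgroup(
                Consumer.from(group, "consumer-1"),
                XReadArgs.Builder.count(10),
                StreamOffset.lastConsumed(stream));

        // XACK each delivered entry when the Kafka offset commit is processed
        for (StreamMessage<String, String> message : messages) {
            redis.xack(stream, group, message.getId());
        }
    }
}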

Retention and Cleanup

Challenge: Kafka supports time-based and size-based retention policies.

Approach:

  • Retention is enforced at write time using XADD arguments (not a background job)

  • Size-based retention: XADD … MAXLEN {count} (exact trimming by default for predictable retention)

  • Time-based retention: XADD … MINID {timestamp}-0 (trims messages older than the timestamp)

  • Approximate trimming is available per topic via the approximate.trimming=true config for better performance

  • Retention configuration stored in topic metadata (Redis Hash)

  • Message count for MAXLEN is derived from retention.bytes and the average message size

Implementation: See ProduceHandler.xAddArgs() which applies retention policies to every XADD command.
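
As a rough illustration of the arguments involved, here is a sketch using Lettuce's XAddArgs builder (Lettuce 6.1+ for MINID support); the key and retention values are hypothetical, not the actual ProduceHandler logic:

import io.lettuce.core.RedisClient;
import io.lettuce.core.XAddArgs;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.Map;

public final class RetentionSketch {
    public static void main(String[] args) {
        RedisCommands<String, String> redis =
                RedisClient.create("redis://localhost").connect().sync();

        // Size-based: XADD orders:0 MAXLEN 100000 * key k1 value v1
        XAddArgs sizeBased = XAddArgs.Builder.maxlen(100_000);
        redis.xadd("orders:0", sizeBased, Map.of("key", "k1", "value", "v1"));

        // Time-based: XADD orders:0 MINID {cutoff}-0 * key k2 value v2
        long cutoff = System.currentTimeMillis() - 7L * 24 * 60 * 60 * 1000;  // 7-day retention
        XAddArgs timeBased = XAddArgs.Builder.minId(cutoff + "-0");
        redis.xadd("orders:0", timeBased, Map.of("key", "k2", "value", "v2"));

        // approximate.trimming=true would map to the "~" modifier (MAXLEN ~ count)
        XAddArgs approximate = XAddArgs.Builder.maxlen(100_000).approximateTrimming(true);
        redis.xadd("orders:0", approximate, Map.of("key", "k3", "value", "v3"));
    }
}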

Transactional Writes

Challenge: Kafka supports atomic writes across partitions.

Approach:

  • Use Redis transactions (MULTI/EXEC) for single-partition atomicity

  • Use Lua scripts for multi-partition atomic writes (both paths are sketched after this list)

  • Store transaction markers in stream entries

  • Implement idempotent producer semantics
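
A minimal sketch of both write paths, assuming the Lettuce client; the Lua script and keys are illustrative rather than the actual Korvet implementation:

import io.lettuce.core.RedisClient;
import io.lettuce.core.ScriptOutputType;
import io.lettuce.core.TransactionResult;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.Map;

public final class AtomicWriteSketch {
    public static void main(String[] args) {
        RedisCommands<String, String> redis =
                RedisClient.create("redis://localhost").connect().sync();

        // Single-partition atomicity: MULTI ... EXEC
        redis.multi();
        redis.xadd("orders:0", Map.of("key", "k1", "value", "v1"));
        redis.xadd("orders:0", Map.of("key", "k2", "value", "v2"));
        TransactionResult result = redis.exec();  // both entries appear atomically

        // Multi-partition atomicity: one Lua script writes to both streams;
        // Redis executes a script as a single atomic unit
        String script =
                "redis.call('XADD', KEYS[1], '*', 'key', ARGV[1], 'value', ARGV[2]) " +
                "redis.call('XADD', KEYS[2], '*', 'key', ARGV[3], 'value', ARGV[4]) " +
                "return 1";
        redis.eval(script, ScriptOutputType.INTEGER,
                new String[] { "orders:0", "orders:1" },
                "k1", "v1", "k2", "v2");
    }
}

Note that in Redis Cluster a Lua script may only touch keys in the same hash slot, which constrains which partitions a single script invocation can cover.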

Performance Optimization

Strategies:

  1. Pipelining: Batch multiple XADD commands in a single network round-trip

  2. Connection Pooling: Reuse Redis connections across requests

  3. Batch Reads: Use XREAD with COUNT to fetch multiple messages efficiently

  4. Group Reads: Use XREADGROUP to read from multiple streams in a single call (strategies 1 and 3 are sketched below)
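
Strategies 1 and 3 can be sketched with Lettuce's manual command flushing, which buffers commands client-side and sends the batch in a single network write; the key and batch size are illustrative:

import io.lettuce.core.LettuceFutures;
import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisFuture;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.async.RedisAsyncCommands;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;

public final class PipelineSketch {
    public static void main(String[] args) {
        StatefulRedisConnection<String, String> connection =
                RedisClient.create("redis://localhost").connect();
        RedisAsyncCommands<String, String> async = connection.async();

        // Buffer commands client-side instead of flushing per command
        connection.setAutoFlushCommands(false);

        List<RedisFuture<String>> futures = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            futures.add(async.xadd("orders:0", Map.of("value", "v" + i)));
        }

        // One network write for the whole batch
        connection.flushCommands();
        LettuceFutures.awaitAll(5, TimeUnit.SECONDS,
                futures.toArray(new RedisFuture[0]));

        connection.setAutoFlushCommands(true);
    }
}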

Open Questions

Multi-Tenancy

Question: How to isolate topics across tenants?

Options:

  • Prefix all keys with the tenant ID: {tenant}:{topic}:{partition} (sketched below)

  • Use separate Redis databases per tenant

  • Use Redis ACLs for access control
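
A trivial sketch of the first option, purely illustrative while the question remains open:

public final class TenantKeys {
    // "{tenant}:{topic}:{partition}" -> e.g. "acme:orders:0"
    static String streamKey(String tenant, String topic, int partition) {
        return tenant + ":" + topic + ":" + partition;
    }
}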

Replication

Question: How to handle Kafka’s replication factor > 1?

Options:

  • Use Redis Cluster or Redis Enterprise for replication

  • Ignore the replication factor and rely on Redis persistence (AOF/RDB)

  • Implement application-level replication across Redis instances

Log Compaction

Question: How to support Kafka’s log compaction feature?

Options:

  • A background process that reads the stream, deduplicates by key, and writes to a new stream (sketched below)

  • Use Redis Streams with custom compaction logic

  • Store compacted view in separate structure
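
A hypothetical sketch of the first option, assuming the Lettuce client; a real compactor would page through the stream with COUNT rather than reading it whole:

import io.lettuce.core.Range;
import io.lettuce.core.RedisClient;
import io.lettuce.core.StreamMessage;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public final class CompactionSketch {
    public static void main(String[] args) {
        RedisCommands<String, String> redis =
                RedisClient.create("redis://localhost").connect().sync();

        // XRANGE orders:0 - +
        List<StreamMessage<String, String>> entries =
                redis.xrange("orders:0", Range.unbounded());

        // Deduplicate by key: the latest entry per key wins
        Map<String, Map<String, String>> latest = new LinkedHashMap<>();
        for (StreamMessage<String, String> entry : entries) {
            latest.put(entry.getBody().get("key"), entry.getBody());
        }

        // Write the compacted view to a separate stream
        for (Map<String, String> body : latest.values()) {
            redis.xadd("orders:0:compacted", body);
        }
    }
}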

Large Messages

Question: How to handle messages larger than Redis limits?

Options:

  • Store large payloads in separate keys and reference them in the stream entry (sketched below)

  • Reject messages above a size threshold with an error response

  • Implement chunking for large messages
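
A hypothetical sketch of the first option, assuming the Lettuce client and an invented key scheme for the payload reference:

import io.lettuce.core.RedisClient;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.Map;

public final class LargeMessageSketch {
    public static void main(String[] args) {
        RedisCommands<String, String> redis =
                RedisClient.create("redis://localhost").connect().sync();

        String payload = "...";                        // large message body
        String payloadKey = "orders:0:payload:12345";  // invented key scheme

        // Store the payload under its own key, reference it from the entry
        redis.set(payloadKey, payload);
        redis.xadd("orders:0", Map.of("key", "k1", "payload-ref", payloadKey));
    }
}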