
Design Decisions and Future Considerations

This page documents the key design decisions behind the Kafka-to-Redis mapping, along with considerations for future work.

Offset Mapping Strategy

Challenge: Kafka uses sequential integer offsets (0, 1, 2…​), while Redis Streams use timestamp-based entry IDs (e.g., 1234567890123-0).

Selected Approach: Stateless Encoding

Kafka offsets are computed directly from Redis Stream entry IDs using bit-packing:

Entry ID format:  {timestamp}-{sequence}
Kafka offset:     (timestamp << sequenceBits) | sequence

Configuration:

  • sequenceBits: Number of bits allocated for sequence number (default: 10)

  • Maximum sequence per millisecond: 2^sequenceBits - 1 (1023 for 10 bits)

  • If sequence exceeds maximum, Redis automatically increments timestamp
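The stateless encoding can be sketched in Python; the function names and the overflow check are illustrative, not part of any broker API:

```python
SEQUENCE_BITS = 10  # default from the configuration above


def entry_id_to_offset(entry_id: str, sequence_bits: int = SEQUENCE_BITS) -> int:
    """Pack a Redis Stream entry ID ("{timestamp}-{sequence}") into a Kafka offset."""
    timestamp, sequence = (int(part) for part in entry_id.split("-"))
    max_sequence = (1 << sequence_bits) - 1
    if sequence > max_sequence:
        raise ValueError(f"sequence {sequence} exceeds {max_sequence}; increase sequenceBits")
    return (timestamp << sequence_bits) | sequence


def offset_to_entry_id(offset: int, sequence_bits: int = SEQUENCE_BITS) -> str:
    """Invert the packing: recover the entry ID from a Kafka offset."""
    timestamp = offset >> sequence_bits
    sequence = offset & ((1 << sequence_bits) - 1)
    return f"{timestamp}-{sequence}"
```

With the default 10 sequence bits, a current millisecond timestamp yields offsets on the order of 2^51, still well within Kafka's signed 64-bit offset range.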

Advantages:

  • No extra storage: No hash mappings needed

  • Stateless: Offset ↔ Entry ID conversion is pure computation

  • Monotonic: Offsets increase monotonically, as Kafka clients expect

  • Efficient: O(1) conversion in both directions

Trade-offs:

  • Offsets are large numbers (not 0, 1, 2…​)

  • Requires configuring sequenceBits based on throughput

  • High-throughput partitions (>1000 msg/ms) need more sequence bits

Consumer Group Coordination

Challenge: Kafka has a complex group coordination protocol (join, sync, heartbeat, rebalance).

Approach:

  • Use Redis Streams native consumer groups for message delivery and offset tracking

  • Implement Kafka group coordinator protocol in broker (in-memory state)

  • Use XGROUP CREATE to create consumer groups

  • Use XREADGROUP for group-based message consumption

  • Use XACK for offset commits (acknowledging messages)

  • Group membership, assignments, and rebalancing managed by broker (not stored in Redis)
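A minimal consume-and-acknowledge cycle for the approach above, assuming a redis-py style client (`xgroup_create`, `xreadgroup`, `xack`); the function names are illustrative:

```python
def ensure_group(client, stream, group):
    """Create the consumer group if missing (XGROUP CREATE ... MKSTREAM)."""
    try:
        client.xgroup_create(stream, group, id="0", mkstream=True)
    except Exception as err:  # redis.ResponseError in practice
        if "BUSYGROUP" not in str(err):  # "group already exists" is fine
            raise


def consume_batch(client, stream, group, consumer, count=100):
    """XREADGROUP a batch of never-delivered messages, then XACK each one."""
    # ">" asks for messages never delivered to this group before
    batches = client.xreadgroup(group, consumer, {stream: ">"}, count=count)
    delivered = []
    for _stream, entries in batches or []:
        for entry_id, fields in entries:
            delivered.append((entry_id, fields))
            client.xack(stream, group, entry_id)  # the Kafka offset-commit analogue
    return delivered
```

Acknowledging per entry keeps the pending-entries list (PEL) small; a real broker might batch the XACKs via a pipeline.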

Retention and Cleanup

Challenge: Kafka supports time-based and size-based retention policies.

Approach:

  • Use XTRIM for size-based retention: XTRIM {stream} MAXLEN ~ 1000000

  • Use XTRIM with MINID for time-based retention: XTRIM {stream} MINID {timestamp}-0

  • Retention configuration managed by broker (in-memory or external config)

  • Background job periodically trims streams based on retention policy

Retention policy enforcement is currently in development.
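One pass of the background trim job might look like the following sketch, assuming a redis-py style client; `trim_stream` and its parameters are illustrative:

```python
import time


def trim_stream(client, stream, max_len=None, retention_ms=None, now_ms=None):
    """Apply size- and/or time-based retention to one stream."""
    if max_len is not None:
        # Size-based retention: approximate trim (MAXLEN ~) is cheaper than exact
        client.xtrim(stream, maxlen=max_len, approximate=True)
    if retention_ms is not None:
        if now_ms is None:
            now_ms = int(time.time() * 1000)
        # Time-based retention: MINID drops every entry older than the cutoff
        client.xtrim(stream, minid=f"{now_ms - retention_ms}-0")
```

The cutoff entry ID works because Redis Stream IDs are timestamp-prefixed; MINID requires Redis 6.2 or later.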

Transactional Writes

Challenge: Kafka supports atomic writes across partitions.

Approach:

  • Use Redis transactions (MULTI/EXEC) for single-partition atomicity

  • Use Lua scripts for multi-partition atomic writes

  • Store transaction markers in stream entries

  • Implement idempotent producer semantics
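A sketch of the MULTI/EXEC path using a redis-py style pipeline (names are illustrative; the Lua-script variant would issue the same XADDs inside a script):

```python
def produce_atomic(client, writes):
    """Append to several partition streams atomically via MULTI/EXEC.

    `writes` maps a stream key ("{topic}:{partition}") to a list of field dicts.
    Within a single Redis node, the queued appends become visible all at once.
    """
    pipe = client.pipeline(transaction=True)  # wraps commands in MULTI ... EXEC
    for stream, messages in writes.items():
        for fields in messages:
            pipe.xadd(stream, fields)  # Redis assigns the entry ID ("*")
    return pipe.execute()  # list of new entry IDs, one per XADD
```

Note that in Redis Cluster both MULTI/EXEC and Lua scripts require every touched key to hash to the same slot, so cross-partition atomicity may force hash tags on stream keys or a single-node deployment.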

Performance Optimization

Strategies:

  1. Pipelining: Batch multiple XADD commands in a single network round-trip

  2. Connection Pooling: Reuse Redis connections across requests

  3. Batch Reads: Use XREAD with COUNT to fetch multiple messages efficiently

  4. Group Reads: Use XREADGROUP to read from multiple streams in a single call
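Strategy 3 can be sketched as a cursor-style batch fetch, again assuming a redis-py style client; the function name is illustrative:

```python
def fetch(client, stream, last_seen_id="0", count=500, block_ms=1000):
    """Batch read: one XREAD returns up to `count` entries after `last_seen_id`."""
    reply = client.xread({stream: last_seen_id}, count=count, block=block_ms)
    if not reply:
        return last_seen_id, []  # timed out with nothing new; cursor unchanged
    _stream, entries = reply[0]
    # Advance the cursor to the last entry ID received
    return entries[-1][0], entries
```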

Future Considerations

Multi-Tenancy

Question: How to isolate topics across tenants?

Options:

  • Prefix all keys with tenant ID: {tenant}:{topic}:{partition}

  • Use separate Redis databases per tenant

  • Use Redis ACLs for access control
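A hypothetical helper for the first option; the ':' check is an added assumption that keeps key parsing unambiguous:

```python
def stream_key(tenant, topic, partition):
    """Build the per-tenant stream key: {tenant}:{topic}:{partition}."""
    for part in (tenant, topic):
        if ":" in part:
            raise ValueError("tenant and topic names must not contain ':'")
    return f"{tenant}:{topic}:{partition}"
```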

Replication

Question: How to handle Kafka’s replication factor > 1?

Options:

  • Use Redis Cluster or Redis Enterprise for replication

  • Ignore replication factor, rely on Redis persistence (AOF/RDB)

  • Implement application-level replication across Redis instances

Log Compaction

Question: How to support Kafka’s log compaction feature?

Options:

  • Background process that reads stream, deduplicates by key, writes to new stream

  • Use Redis Streams with custom compaction logic

  • Store compacted view in separate structure
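The dedup-by-key pass at the heart of the first option can be sketched in pure Python; the field layout (`key`/`value`, with a None value acting as a Kafka-style tombstone) is an assumption:

```python
def compact(entries):
    """Keep only the latest entry per key; `entries` arrive in entry-ID order."""
    latest = {}
    for entry_id, fields in entries:
        key = fields["key"]
        if fields.get("value") is None:
            latest.pop(key, None)  # tombstone: delete the key entirely
        else:
            latest[key] = (entry_id, fields)
    # Re-emit survivors in entry-ID order (numeric, not lexicographic)
    return sorted(latest.values(),
                  key=lambda item: tuple(int(p) for p in item[0].split("-")))
```

The surviving entries would then be XADDed to the new stream with their original IDs so the offset encoding stays valid.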

Large Messages

Question: How to handle messages larger than Redis limits?

Options:

  • Store large payloads in separate keys, reference in stream entry

  • Reject messages above threshold with error response

  • Implement chunking for large messages
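The chunking option might be sketched as follows; the 512 KiB threshold and the field names are assumptions, not Redis limits:

```python
CHUNK_BYTES = 512 * 1024  # assumed threshold; large stream entries hurt
                          # latency and memory even before Redis limits apply


def split_chunks(payload, chunk_bytes=CHUNK_BYTES):
    """Split a large payload into ordered chunks for separate stream entries."""
    total = (len(payload) + chunk_bytes - 1) // chunk_bytes
    return [
        {"chunk": i, "of": total,
         "data": payload[i * chunk_bytes:(i + 1) * chunk_bytes]}
        for i in range(total)
    ]


def join_chunks(chunks):
    """Reassemble a payload from its chunks, whatever order they arrived in."""
    ordered = sorted(chunks, key=lambda c: c["chunk"])
    assert ordered and ordered[0]["of"] == len(ordered), "missing chunks"
    return b"".join(c["data"] for c in ordered)
```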