
Design Decisions

This page documents key design decisions made in the Kafka-to-Redis mapping. It is intended for contributors and operators who need implementation rationale and trade-offs.

Offset Mapping Strategy

Challenge: Kafka uses sequential integer offsets (0, 1, 2…​), while Redis Streams use timestamp-based entry IDs (e.g., 1234567890123-0).

Selected Approach: Stateless Encoding

Kafka offsets are computed directly from Redis Stream entry IDs using bit-packing:

Entry ID format:  {timestamp}-{sequence}
Kafka offset:     (timestamp << sequenceBits) | sequence

Configuration:

  • sequenceBits: Number of bits allocated for the sequence number (default: 10)

  • Maximum sequence per millisecond: 2^sequenceBits - 1 (1023 for 10 bits)

  • If the sequence exceeds the maximum, Redis automatically increments the timestamp
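
Since the conversion is pure arithmetic, it can be shown in a few lines. A minimal sketch in Java, assuming the default sequenceBits of 10; the class and method names are illustrative, not taken from the Korvet codebase:

// Hypothetical codec for the stateless encoding described above.
public final class OffsetCodec {

    private final int sequenceBits;
    private final long sequenceMask;

    public OffsetCodec(int sequenceBits) {
        this.sequenceBits = sequenceBits;
        this.sequenceMask = (1L << sequenceBits) - 1;  // 1023 for the default 10 bits
    }

    // "{timestamp}-{sequence}" -> Kafka offset
    public long toOffset(String entryId) {
        int dash = entryId.indexOf('-');
        long timestamp = Long.parseLong(entryId.substring(0, dash));
        long sequence = Long.parseLong(entryId.substring(dash + 1));
        return (timestamp << sequenceBits) | sequence;
    }

    // Kafka offset -> "{timestamp}-{sequence}"
    public String toEntryId(long offset) {
        return (offset >>> sequenceBits) + "-" + (offset & sequenceMask);
    }
}

For example, entry ID 1234567890123-5 encodes to (1234567890123 << 10) | 5 = 1264197519485957. A millisecond timestamp currently occupies about 41 bits, so with 10 sequence bits the result fits comfortably in a signed 64-bit long.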

Advantages:

  • No extra storage: No hash mappings needed

  • Stateless: Offset ↔ Entry ID conversion is pure computation

  • Monotonic: Offsets increase monotonically, as in Kafka

  • Efficient: O(1) conversion in both directions

Trade-offs:

  • Offsets are large numbers (not 0, 1, 2…​)

  • Requires configuring sequenceBits based on throughput

  • High-throughput partitions (>1000 msg/ms) need more sequence bits

Consumer Group Coordination

Challenge: Kafka has a complex group coordination protocol (join, sync, heartbeat, rebalance).

Approach:

  • Use Redis Streams native consumer groups for message delivery and offset tracking

  • Implement the Kafka group coordinator protocol in the broker (in-memory state)

  • Use XGROUP CREATE to create consumer groups

  • Use XREADGROUP for group-based message consumption

  • Use XACK to acknowledge delivered Redis entries during commit processing

  • Persist committed Kafka offsets in a dedicated committed-offset store

  • Group membership, assignments, and rebalancing are managed by broker state rather than Redis stream metadata; the Redis commands involved are sketched below
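
A minimal sketch of the Redis side of this approach, assuming the Lettuce client; the stream key, group, and consumer names are hypothetical:

import io.lettuce.core.Consumer;
import io.lettuce.core.RedisClient;
import io.lettuce.core.StreamMessage;
import io.lettuce.core.XGroupCreateArgs;
import io.lettuce.core.XReadArgs;
import io.lettuce.core.XReadArgs.StreamOffset;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.List;

public final class GroupReadSketch {
    public static void main(String[] args) {
        RedisCommands<String, String> redis =
                RedisClient.create("redis://localhost").connect().sync();

        String stream = "orders:0";      // hypothetical {topic}:{partition} key
        String group = "example-group";

        // XGROUP CREATE orders:0 example-group 0 MKSTREAM
        redis.xgroupCreate(StreamOffset.from(stream, "0"), group,
                XGroupCreateArgs.Builder.mkstream());

        // XREADGROUP GROUP example-group consumer-1 COUNT 10 STREAMS orders:0 >
        List<StreamMessage<String, String>> messages = redis.xreadgroup(
                Consumer.from(group, "consumer-1"),
                XReadArgs.Builder.count(10),
                StreamOffset.lastConsumed(stream));

        // XACK each delivered entry when the Kafka offset commit is processed
        for (StreamMessage<String, String> message : messages) {
            redis.xack(stream, group, message.getId());
        }
    }
}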

Retention and Cleanup

Challenge: Kafka supports time-based and size-based retention policies.

Approach:

  • Retention is enforced at write time using XADD arguments (not a background job)

  • Size-based retention: XADD … MAXLEN {count} (exact trimming by default for predictable retention)

  • Time-based retention: XADD … MINID {timestamp}-0 (trims messages older than the timestamp)

  • Approximate trimming is available per topic via the approximate.trimming=true config for better performance

  • Retention configuration stored in topic metadata (Redis Hash)

  • Message count for MAXLEN is derived from retention.bytes and the average message size

Implementation: See ProduceHandler.xAddArgs() which applies retention policies to every XADD command.
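
As a rough illustration of the arguments involved, here is a sketch using Lettuce's XAddArgs builder (Lettuce 6.1+ for MINID support); the key and retention values are hypothetical, not the actual ProduceHandler logic:

import io.lettuce.core.RedisClient;
import io.lettuce.core.XAddArgs;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.Map;

public final class RetentionSketch {
    public static void main(String[] args) {
        RedisCommands<String, String> redis =
                RedisClient.create("redis://localhost").connect().sync();

        // Size-based: XADD orders:0 MAXLEN 100000 * key k1 value v1
        XAddArgs sizeBased = XAddArgs.Builder.maxlen(100_000);
        redis.xadd("orders:0", sizeBased, Map.of("key", "k1", "value", "v1"));

        // Time-based: XADD orders:0 MINID {cutoff}-0 * key k2 value v2
        long cutoff = System.currentTimeMillis() - 7L * 24 * 60 * 60 * 1000;  // 7-day retention
        XAddArgs timeBased = XAddArgs.Builder.minId(cutoff + "-0");
        redis.xadd("orders:0", timeBased, Map.of("key", "k2", "value", "v2"));

        // approximate.trimming=true would map to the "~" modifier (MAXLEN ~ count)
        XAddArgs approximate = XAddArgs.Builder.maxlen(100_000).approximateTrimming(true);
        redis.xadd("orders:0", approximate, Map.of("key", "k3", "value", "v3"));
    }
}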

Transactional Writes

Challenge: Kafka supports atomic writes across partitions.

Approach:

  • Use Redis transactions (MULTI/EXEC) for single-partition atomicity

  • Use Lua scripts for multi-partition atomic writes (both paths are sketched after this list)

  • Store transaction markers in stream entries

  • Implement idempotent producer semantics
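
A minimal sketch of both write paths, assuming the Lettuce client; the Lua script and keys are illustrative rather than the actual Korvet implementation:

import io.lettuce.core.RedisClient;
import io.lettuce.core.ScriptOutputType;
import io.lettuce.core.TransactionResult;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.Map;

public final class AtomicWriteSketch {
    public static void main(String[] args) {
        RedisCommands<String, String> redis =
                RedisClient.create("redis://localhost").connect().sync();

        // Single-partition atomicity: MULTI ... EXEC
        redis.multi();
        redis.xadd("orders:0", Map.of("key", "k1", "value", "v1"));
        redis.xadd("orders:0", Map.of("key", "k2", "value", "v2"));
        TransactionResult result = redis.exec();  // both entries appear atomically

        // Multi-partition atomicity: one Lua script writes to both streams;
        // Redis executes a script as a single atomic unit
        String script =
                "redis.call('XADD', KEYS[1], '*', 'key', ARGV[1], 'value', ARGV[2]) " +
                "redis.call('XADD', KEYS[2], '*', 'key', ARGV[3], 'value', ARGV[4]) " +
                "return 1";
        redis.eval(script, ScriptOutputType.INTEGER,
                new String[] { "orders:0", "orders:1" },
                "k1", "v1", "k2", "v2");
    }
}

Note that in Redis Cluster a Lua script may only touch keys in the same hash slot, which constrains which partitions a single script invocation can cover.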

Performance Optimization

Strategies:

  1. Pipelining: Batch multiple XADD commands in a single network round-trip

  2. Connection Pooling: Reuse Redis connections across requests

  3. Batch Reads: Use XREAD with COUNT to fetch multiple messages efficiently

  4. Group Reads: Use XREADGROUP to read from multiple streams in a single call (strategies 1 and 3 are sketched below)
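
Strategies 1 and 3 can be sketched with Lettuce's manual command flushing, which buffers commands client-side and sends the batch in a single network write; the key and batch size are illustrative:

import io.lettuce.core.LettuceFutures;
import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisFuture;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.async.RedisAsyncCommands;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;

public final class PipelineSketch {
    public static void main(String[] args) {
        StatefulRedisConnection<String, String> connection =
                RedisClient.create("redis://localhost").connect();
        RedisAsyncCommands<String, String> async = connection.async();

        // Buffer commands client-side instead of flushing per command
        connection.setAutoFlushCommands(false);

        List<RedisFuture<String>> futures = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            futures.add(async.xadd("orders:0", Map.of("value", "v" + i)));
        }

        // One network write for the whole batch
        connection.flushCommands();
        LettuceFutures.awaitAll(5, TimeUnit.SECONDS,
                futures.toArray(new RedisFuture[0]));

        connection.setAutoFlushCommands(true);
    }
}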

Open Questions

Multi-Tenancy

Question: How to isolate topics across tenants?

Options:

  • Prefix all keys with the tenant ID: {tenant}:{topic}:{partition} (sketched below)

  • Use separate Redis databases per tenant

  • Use Redis ACLs for access control
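
A trivial sketch of the first option, purely illustrative while the question remains open:

public final class TenantKeys {
    // "{tenant}:{topic}:{partition}" -> e.g. "acme:orders:0"
    static String streamKey(String tenant, String topic, int partition) {
        return tenant + ":" + topic + ":" + partition;
    }
}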

Replication

Question: How to handle Kafka’s replication factor > 1?

Options:

  • Use Redis Cluster or Redis Enterprise for replication

  • Ignore the replication factor and rely on Redis persistence (AOF/RDB)

  • Implement application-level replication across Redis instances

Log Compaction

Question: How to support Kafka’s log compaction feature?

Options:

  • A background process that reads the stream, deduplicates by key, and writes to a new stream (sketched below)

  • Use Redis Streams with custom compaction logic

  • Store compacted view in separate structure
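
A hypothetical sketch of the first option, assuming the Lettuce client; a real compactor would page through the stream with COUNT rather than reading it whole:

import io.lettuce.core.Range;
import io.lettuce.core.RedisClient;
import io.lettuce.core.StreamMessage;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public final class CompactionSketch {
    public static void main(String[] args) {
        RedisCommands<String, String> redis =
                RedisClient.create("redis://localhost").connect().sync();

        // XRANGE orders:0 - +
        List<StreamMessage<String, String>> entries =
                redis.xrange("orders:0", Range.unbounded());

        // Deduplicate by key: the latest entry per key wins
        Map<String, Map<String, String>> latest = new LinkedHashMap<>();
        for (StreamMessage<String, String> entry : entries) {
            latest.put(entry.getBody().get("key"), entry.getBody());
        }

        // Write the compacted view to a separate stream
        for (Map<String, String> body : latest.values()) {
            redis.xadd("orders:0:compacted", body);
        }
    }
}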

Large Messages

Question: How to handle messages larger than Redis limits?

Options:

  • Store large payloads in separate keys and reference them in the stream entry (sketched below)

  • Reject messages above a size threshold with an error response

  • Implement chunking for large messages
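
A hypothetical sketch of the first option, assuming the Lettuce client and an invented key scheme for the payload reference:

import io.lettuce.core.RedisClient;
import io.lettuce.core.api.sync.RedisCommands;

import java.util.Map;

public final class LargeMessageSketch {
    public static void main(String[] args) {
        RedisCommands<String, String> redis =
                RedisClient.create("redis://localhost").connect().sync();

        String payload = "...";                        // large message body
        String payloadKey = "orders:0:payload:12345";  // invented key scheme

        // Store the payload under its own key, reference it from the entry
        redis.set(payloadKey, payload);
        redis.xadd("orders:0", Map.of("key", "k1", "payload-ref", payloadKey));
    }
}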