Design Decisions
This page documents key design decisions made in the Kafka-to-Redis mapping.
Offset Mapping Strategy
Challenge: Kafka uses sequential integer offsets (0, 1, 2…), while Redis Streams use timestamp-based entry IDs (e.g., 1234567890123-0).
Selected Approach: Stateless Encoding
Kafka offsets are computed directly from Redis Stream entry IDs using bit-packing:
- Entry ID format: `{timestamp}-{sequence}`
- Kafka offset: `(timestamp << sequenceBits) | sequence`
Configuration:
- `sequenceBits`: number of bits allocated to the sequence number (default: 10)
- Maximum sequence per millisecond: `2^sequenceBits - 1` (1023 for 10 bits)
- If the sequence exceeds the maximum, Redis automatically increments the timestamp
Advantages:
- No extra storage: no hash mappings needed
- Stateless: offset ↔ entry ID conversion is pure computation
- Monotonic: offsets increase monotonically, as in Kafka
- Efficient: O(1) conversion in both directions
Trade-offs:
- Offsets are large numbers (not 0, 1, 2…)
- `sequenceBits` must be configured to match expected throughput
- High-throughput partitions (>1000 msg/ms) need more sequence bits
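The encoding above can be sketched as a pair of pure functions. This is an illustrative sketch, not the actual implementation; the function names are assumptions, and only the bit layout comes from the formulas above.

```python
SEQUENCE_BITS = 10  # default from the configuration above
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1  # 1023 for 10 bits

def entry_id_to_offset(entry_id: str) -> int:
    """Convert a Redis Stream entry ID ("{timestamp}-{sequence}") to a Kafka offset."""
    timestamp, sequence = (int(part) for part in entry_id.split("-"))
    return (timestamp << SEQUENCE_BITS) | sequence

def offset_to_entry_id(offset: int) -> str:
    """Convert a Kafka offset back to a Redis Stream entry ID."""
    return f"{offset >> SEQUENCE_BITS}-{offset & SEQUENCE_MASK}"

# Round-trip: the conversion is lossless in both directions
entry_id = "1234567890123-5"
assert offset_to_entry_id(entry_id_to_offset(entry_id)) == entry_id
```

Because no lookup table is involved, both directions are O(1) and require no storage beyond the stream itself.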
Consumer Group Coordination
Challenge: Kafka has a complex group coordination protocol (join, sync, heartbeat, rebalance).
Approach:
- Use Redis Streams native consumer groups for message delivery and offset tracking
- Implement the Kafka group coordinator protocol in the broker (in-memory state)
- Use `XGROUP CREATE` to create consumer groups
- Use `XREADGROUP` for group-based message consumption
- Use `XACK` for offset commits (acknowledging messages)
- Group membership, assignments, and rebalancing are managed by the broker (not stored in Redis)
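The mapping from Kafka group operations to the Redis commands above can be sketched as plain command tuples (no connection is made; the `{topic}:{partition}` key scheme and all function names are assumptions for illustration):

```python
def stream_key(topic: str, partition: int) -> str:
    # Assumed key layout: one Redis Stream per topic-partition
    return f"{topic}:{partition}"

def create_group(topic: str, partition: int, group: str) -> tuple:
    # MKSTREAM creates the stream if it does not exist yet
    return ("XGROUP", "CREATE", stream_key(topic, partition), group, "0", "MKSTREAM")

def read_group(topic: str, partition: int, group: str, consumer: str, count: int) -> tuple:
    # ">" asks for entries never delivered to this group
    return ("XREADGROUP", "GROUP", group, consumer, "COUNT", str(count),
            "STREAMS", stream_key(topic, partition), ">")

def commit(topic: str, partition: int, group: str, entry_id: str) -> tuple:
    # A Kafka offset commit maps to acknowledging the entry
    return ("XACK", stream_key(topic, partition), group, entry_id)
```

Only message delivery and acknowledgement touch Redis; join/sync/heartbeat state stays in the broker's memory, per the list above.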
Retention and Cleanup
Challenge: Kafka supports time-based and size-based retention policies.
Approach:
- Retention is enforced at write time using `XADD` arguments (not a background job)
- Size-based retention: `XADD … MAXLEN {count}` (exact trimming by default for predictable retention)
- Time-based retention: `XADD … MINID {timestamp}-0` (trims messages older than the timestamp)
- Approximate trimming is available per topic via the `approximate.trimming=true` config for better performance
- Retention configuration is stored in topic metadata (a Redis Hash)
- The message count is calculated from `retention.bytes` and the average message size
Implementation: see `ProduceHandler.xAddArgs()`, which applies retention policies to every `XADD` command.
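A minimal sketch of how such write-time trimming arguments could be assembled, assuming hypothetical config keys (`retention.bytes`, `retention.ms`, `avg.message.size`, `approximate.trimming`); the real logic lives in `ProduceHandler.xAddArgs()`:

```python
def xadd_args(key: str, fields: dict, config: dict, now_ms: int) -> list:
    """Build an XADD command that enforces retention at write time."""
    args = ["XADD", key]
    approximate = "~" if config.get("approximate.trimming") else "="
    if "retention.bytes" in config:
        # Size-based retention: cap the stream at an estimated message count
        avg_msg_size = config.get("avg.message.size", 1024)
        max_len = max(1, config["retention.bytes"] // avg_msg_size)
        args += ["MAXLEN", approximate, str(max_len)]
    elif "retention.ms" in config:
        # Time-based retention: drop entries older than the cutoff timestamp
        args += ["MINID", approximate, f"{now_ms - config['retention.ms']}-0"]
    args.append("*")  # let Redis assign the entry ID
    for field, value in fields.items():
        args += [field, value]
    return args
```

For example, `retention.bytes=10240` with a 1 KiB average message size yields `XADD key MAXLEN = 10 * field value`, trimming to roughly the configured byte budget on every write.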
Transactional Writes
Challenge: Kafka supports atomic writes across partitions.
Approach:
- Use Redis transactions (`MULTI`/`EXEC`) for single-partition atomicity
- Use Lua scripts for multi-partition atomic writes
- Store transaction markers in stream entries
- Implement idempotent producer semantics
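The single-partition case can be sketched as raw command tuples (no connection is made; the `txn` marker field and function name are illustrative assumptions, and multi-partition batches would go through a Lua script instead, per the list above):

```python
def transactional_batch(key: str, batch: list, txn_id: str) -> list:
    """Wrap a batch of messages for one partition in MULTI/EXEC,
    tagging each entry with a transaction marker field."""
    commands = [("MULTI",)]
    for fields in batch:
        entry = ("XADD", key, "*", "txn", txn_id)
        for field, value in fields.items():
            entry += (field, value)
        commands.append(entry)
    commands.append(("EXEC",))
    return commands
```

All `XADD`s between `MULTI` and `EXEC` are applied atomically, so consumers never observe a partially written batch within a partition.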
Performance Optimization
Strategies:
- Pipelining: batch multiple `XADD` commands into a single network round-trip
- Connection pooling: reuse Redis connections across requests
- Batch reads: use `XREAD` with `COUNT` to fetch multiple messages efficiently
- Group reads: use `XREADGROUP` to read from multiple streams in a single call
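The group-read strategy can be sketched as one `XREADGROUP` command covering many partitions, rather than one round-trip per partition (key naming and function name are assumptions):

```python
def group_read_command(group: str, consumer: str, partitions: list, count: int) -> tuple:
    """Build a single XREADGROUP covering every (topic, partition) pair."""
    keys = [f"{topic}:{partition}" for topic, partition in partitions]
    # Redis requires one ID per stream key; ">" requests undelivered entries
    return ("XREADGROUP", "GROUP", group, consumer, "COUNT", str(count),
            "STREAMS", *keys, *([">"] * len(keys)))
```

A fetch spanning N partitions thus costs one round-trip instead of N, which is the point of the "group reads" strategy above.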
Open Questions
Multi-Tenancy
Question: How to isolate topics across tenants?
Options:
- Prefix all keys with a tenant ID: `{tenant}:{topic}:{partition}`
- Use separate Redis databases per tenant
- Use Redis ACLs for access control
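If the key-prefixing option were chosen, it would reduce to a small naming convention; only the `{tenant}:{topic}:{partition}` layout comes from the list above, the rest is a hypothetical sketch:

```python
def tenant_stream_key(tenant: str, topic: str, partition: int) -> str:
    # Every stream key carries the tenant prefix, so an ACL key pattern
    # like "{tenant}:*" can confine each tenant to its own namespace
    return f"{tenant}:{topic}:{partition}"
```

Prefixing composes with the ACL option: a per-tenant key pattern grants access only to keys under that prefix.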
Replication
Question: How to handle Kafka’s replication factor > 1?
Options:
- Use Redis Cluster or Redis Enterprise for replication
- Ignore the replication factor and rely on Redis persistence (AOF/RDB)
- Implement application-level replication across Redis instances