Introduction

RIOT-X is an extension to RIOT which provides the following additional features for Redis Cloud and Redis Software:

  • Observability

  • Memcached Replication

  • Redis Stream Import/Export

[Image: RIOT-X replication dashboard]
[Image: RIOT-X JVM dashboard]

Install

RIOT-X can be installed on Linux, macOS, and Windows. It is a standalone tool that connects remotely to a Redis database, so it does not need to run on the Redis server itself.

Homebrew (macOS & Linux)

brew install redis/tap/riotx

Scoop (Windows)

scoop bucket add redis https://github.com/redis/scoop.git
scoop install riotx

Manual Installation (All Platforms)

Download the pre-compiled binary from RIOT-X Releases, uncompress it, and copy it to the desired location.

riotx-0.2.2.zip requires Java 11 or greater to be installed.

riotx-standalone-0.2.2-*.zip includes its own Java runtime and does not require a Java installation.

Docker

You can run RIOT-X as a Docker image:

docker run riotx/riotx [OPTIONS] [COMMAND]

Usage

You can launch RIOT-X with the following command:

riotx

This will show usage help, which you can also get by running:

riotx --help

--help is available on any command:

riotx COMMAND --help

Run the following command to give riotx TAB completion in the current shell:

source <(riotx generate-completion)

Memcached Replication

The memcached-replicate command reads data from a source Memcached database and writes to a target Memcached database.

riotx memcached-replicate SOURCE TARGET [OPTIONS]

For the full usage, run:

riotx memcached-replicate --help

Example

riotx memcached-replicate mydb.cache.amazonaws.com:11211 mydb-12211.redis.com:12211 --source-tls

Stream Export

The stream-export command enables Redis change data capture (CDC) to a Redis stream.

riotx stream-export SOURCE TARGET [OPTIONS]

For the full usage, run:

riotx stream-export --help

Example: Export stream to another Redis instance

riotx stream-export redis://localhost:6379 redis://localhost:6380 --mode live

redis-cli -p 6380 xread COUNT 3 STREAMS stream:export 0-0
1) 1) "stream:export"
   2) 1) 1) "1718645537588-0"
         2)  1) "key"
             2) "order:4"
             3) "time"
             4) "1718645537000"
             5) "type"
             6) "hash"
             7) "ttl"
             8) "-1"
             9) "mem"
            10) "136"
            11) "value"
            12) "{\"order_date\":\"2024-06-13 22:19:35.143797\",\"order_id\":\"4\"}"
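
A consumer of the exported stream would typically turn each entry's flat field list into a mapping and decode the value. The following Python sketch does this for the entry shown above. The helper name parse_export_entry is hypothetical, and the sketch assumes the value field holds a JSON document, as it does for the hash in this example.

```python
import json


def parse_export_entry(fields):
    """Turn a flat [name, value, name, value, ...] reply into a dict.

    Assumes the "value" field holds a JSON document, as in the example
    entry above. All other fields are left as strings.
    """
    entry = dict(zip(fields[0::2], fields[1::2]))
    entry["value"] = json.loads(entry["value"])
    return entry


# Field list copied from the XREAD output above.
fields = [
    "key", "order:4",
    "time", "1718645537000",
    "type", "hash",
    "ttl", "-1",
    "mem", "136",
    "value", '{"order_date":"2024-06-13 22:19:35.143797","order_id":"4"}',
]

entry = parse_export_entry(fields)
print(entry["key"], entry["type"], entry["value"]["order_id"])  # order:4 hash 4
```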

Stream Import

The stream-import command reads data from a stream and writes it to Redis.

The basic usage is:

riotx stream-import STREAM...

For the full usage, run:

riotx stream-import --help

Example: Import stream into hashes

riotx stream-import stream:beers --idle-timeout 1 hset --keyspace beer --key id
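
To make the effect of --keyspace and --key more concrete, the Python sketch below shows how a hash key could be derived from a stream message. It assumes the keyspace and the value of the key field are joined with a colon, which this document does not state. The function name and the sample message are hypothetical.

```python
def hash_key(message, keyspace="beer", key_field="id", separator=":"):
    """Build the target hash key for one stream message.

    Assumption: the key is "<keyspace><separator><value of key field>".
    The colon separator is not confirmed by this document.
    """
    return f"{keyspace}{separator}{message[key_field]}"


# Hypothetical message read from stream:beers.
message = {"id": "1", "name": "Example Ale"}
print(hash_key(message))  # beer:1
```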

Observability

RIOT-X exposes several metrics over a Prometheus endpoint that can be useful for troubleshooting and performance tuning.

Getting Started

The riotx-dist repository includes a Docker Compose configuration that sets up Prometheus and Grafana.

git clone https://github.com/redis-field-engineering/riotx-dist.git
cd riotx-dist
docker compose up

Prometheus is configured to scrape the host every second.

You can access the Grafana dashboard at localhost:3000.

Now start RIOT-X with the following command:

riotx replicate ... --metrics

This will enable the Prometheus metrics exporter endpoint and will populate the Grafana dashboard.

Configuration

Use the --metrics* options to enable and configure metrics:

--metrics

Enable metrics

--metrics-jvm

Enable JVM and system metrics

--metrics-redis

Enable command latency metrics. See https://github.com/redis/lettuce/wiki/Command-Latency-Metrics#micrometer

--metrics-name=<name>

Application name tag that will be applied to all metrics

--metrics-port=<int>

Port that Prometheus HTTP server should listen on (default: 8080)

--metrics-prop=<k=v>

Additional properties to pass to the Prometheus client. See https://prometheus.github.io/client_java/config/config/
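
Outside Grafana, the metrics can also be inspected with a few lines of code. The Python sketch below parses a payload in the Prometheus text exposition format. The sample values and the label name are made up for illustration, and the URL in the comment (default port 8080, path /metrics) is an assumption, not something this document confirms.

```python
def parse_metrics(text):
    """Parse Prometheus text exposition format into {name: [(labels, value)]}.

    Minimal sketch: skips comment lines and does not handle the
    optional trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, value = line.rsplit(None, 1)
        name, _, labels = series.partition("{")
        samples.setdefault(name, []).append((labels.rstrip("}"), float(value)))
    return samples


# Illustrative payload: values and the "name" label are made up.
# A live payload could be fetched with, for example:
#   urllib.request.urlopen("http://localhost:8080/metrics").read().decode()
# (assumed URL)
sample = """\
# TYPE spring_batch_redis_read_queue_size gauge
spring_batch_redis_read_queue_size{name="reader"} 2500.0
spring_batch_redis_read_queue_capacity{name="reader"} 7500.0
process_cpu_usage 0.25
"""

metrics = parse_metrics(sample)
for name, series in metrics.items():
    print(name, series)
```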

Metrics

Below you can find a list of all metrics declared by RIOT-X.

[Image: RIOT-X replication dashboard]

Replication Metrics

Name                                         Type     Description

riotx_replication_bytes_total                Counter  Number of bytes replicated (needs memory usage with --mem-limit)
riotx_replication_lag_seconds                Summary  Replication latency
spring_batch_chunk_write_seconds             Timer    Batch writing duration
spring_batch_item_process_seconds            Timer    Item processing duration
spring_batch_item_read_seconds               Timer    Item reading duration
spring_batch_job_active_seconds              Timer    Active jobs
spring_batch_job_launch_count_total          Counter  Job launch count
spring_batch_redis_key_event_queue_capacity  Gauge    Gauge reflecting the remaining capacity of the queue
spring_batch_redis_key_event_queue_size      Gauge    Gauge reflecting the size (depth) of the queue
spring_batch_redis_key_scan_total            Counter  Number of keys scanned
spring_batch_redis_operation_seconds         Timer    Operation execution duration
spring_batch_redis_read_chunk                Gauge    Gauge reflecting the chunk size of the reader
spring_batch_redis_read_queue_capacity       Gauge    Gauge reflecting the remaining capacity of the queue
spring_batch_redis_read_queue_size           Gauge    Gauge reflecting the size (depth) of the queue

JVM Metrics

Use the --metrics-jvm option to enable the following additional metrics:

[Image: RIOT-X JVM dashboard]

Name                                  Type     Description

jvm_buffer_count_buffers              Gauge    An estimate of the number of buffers in the pool
jvm_buffer_memory_used_bytes          Gauge    An estimate of the memory that the Java virtual machine is using for this buffer pool
jvm_buffer_total_capacity_bytes       Gauge    An estimate of the total capacity of the buffers in this pool
jvm_gc_concurrent_phase_time_seconds  Timer    Time spent in concurrent phase
jvm_gc_live_data_size_bytes           Gauge    Size of long-lived heap memory pool after reclamation
jvm_gc_max_data_size_bytes            Gauge    Max size of long-lived heap memory pool
jvm_gc_memory_allocated_bytes_total   Counter  Incremented for an increase in the size of the (young) heap memory pool after one GC to before the next
jvm_gc_memory_promoted_bytes_total    Counter  Count of positive increases in the size of the old generation memory pool before GC to after GC
jvm_gc_pause_seconds                  Timer    Time spent in GC pause
jvm_memory_committed_bytes            Gauge    The amount of memory in bytes that is committed for the Java virtual machine to use
jvm_memory_max_bytes                  Gauge    The maximum amount of memory in bytes that can be used for memory management
jvm_memory_used_bytes                 Gauge    The amount of used memory
jvm_threads_daemon_threads            Gauge    The current number of live daemon threads
jvm_threads_live_threads              Gauge    The current number of live threads including both daemon and non-daemon threads
jvm_threads_peak_threads              Gauge    The peak live thread count since the Java virtual machine started or peak was reset
jvm_threads_started_threads_total     Counter  The total number of application threads started in the JVM
jvm_threads_states_threads            Gauge    The current number of threads
process_cpu_time_ns_total             Counter  The "cpu time" used by the Java Virtual Machine process
process_cpu_usage                     Gauge    The "recent cpu usage" for the Java Virtual Machine process
process_start_time_seconds            Gauge    Start time of the process since Unix epoch
process_uptime_seconds                Gauge    The uptime of the Java virtual machine
system_cpu_count                      Gauge    The number of processors available to the Java virtual machine
system_cpu_usage                      Gauge    The "recent cpu usage" of the system the application is running in
system_load_average_1m                Gauge    The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time

Best Practices

This section contains best practices and recipes for various RIOT-X use cases.

Replication Performance Tuning

[Image: replication architecture]

The replicate command reads from a source Redis database and writes to a target Redis database.

Replication Bottleneck

To optimize throughput it is necessary to understand the two main possible scenarios:

  • Slow Producer: In this scenario the reader does not read from source as fast as the writer can write to the target. This means the writer is starved and we should look into ways to speed up the reader.

  • Slow Consumer: In this scenario the writer cannot keep up with the reader and we should look into optimizing writes.

There are two ways to identify which scenario we fall into:

  • No-op writer: With the --dry-run option the replication process uses a no-op writer instead of a Redis writer. If throughput with --dry-run is similar to throughput without it, the writer is not the bottleneck. Follow the steps below to improve reader throughput.

  • Reader queue utilization: Using the Grafana dashboard you can monitor reader queue depth. A low queue utilization means the writer can keep up with the reader. A queue utilization close to 100% means writes are slower than reads.
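
Both checks above can be expressed as small calculations. The Python sketch below derives queue utilization from the spring_batch_redis_read_queue_size and spring_batch_redis_read_queue_capacity gauges (the latter reports remaining capacity) and compares dry-run throughput against normal throughput. The function names and the thresholds (90%, 10%, and a 10% tolerance) are illustrative assumptions, not RIOT-X settings.

```python
def queue_utilization(size, remaining_capacity):
    """Fraction of the reader queue currently in use.

    size:               spring_batch_redis_read_queue_size
    remaining_capacity: spring_batch_redis_read_queue_capacity
    """
    total = size + remaining_capacity
    return size / total if total else 0.0


def likely_bottleneck(utilization, high=0.9, low=0.1):
    """Classify the bottleneck using illustrative thresholds (assumptions)."""
    if utilization >= high:
        return "writer"  # queue nearly full: writes are slower than reads
    if utilization <= low:
        return "reader"  # queue nearly empty: the writer keeps up
    return "inconclusive"


def writer_limits_throughput(dry_run_rate, normal_rate, tolerance=0.1):
    """True when the --dry-run throughput is clearly higher than normal."""
    return dry_run_rate > normal_rate * (1 + tolerance)


print(likely_bottleneck(queue_utilization(9800, 200)))   # writer
print(likely_bottleneck(queue_utilization(150, 9850)))   # reader
print(writer_limits_throughput(50000, 48000))            # False
```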

Reader

To improve reader performance, tweak the options below until you reach optimal throughput.

--read-threads

How many value reader threads to use in parallel (default: 1)

--read-batch

Number of values each reader thread should read in a single pipelined call (default: 50)

--read-queue

Capacity of the reader queue (default: 10000). When the queue is full the threads wait for space to become available. Increase this value if you have peaky traffic on the source database causing fluctuating reader throughput.

--source-pool

Number of Redis connections to the source database (default: 8). Keep in sync with the number of threads to have a dedicated connection per thread.

Writer

To improve writer performance you can tweak the following options:

--batch

Number of items written in a single network round-trip to the Redis server (i.e. number of commands in the pipeline)

--threads

How many write operations can be performed concurrently (default: 1)

--target-pool

Number of Redis connections to the target database (default: 8). Keep in sync with the number of threads to have a dedicated connection per thread.
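
As a rough illustration of keeping pool sizes in sync with thread counts, the Python sketch below assembles a replicate command line from the documented options. The helper name is hypothetical, and the positional SOURCE TARGET syntax for replicate is assumed by analogy with the other commands in this document. The sketch never sizes a pool below the documented default of 8, which is a choice made here rather than a RIOT-X rule.

```python
import shlex

DEFAULT_POOL = 8  # documented default for --source-pool and --target-pool


def replicate_command(source, target, read_threads=1, write_threads=1):
    """Build a riotx replicate command with one connection per thread.

    Pools are sized to the thread count, but never below the default.
    """
    args = [
        "riotx", "replicate", source, target,
        "--read-threads", str(read_threads),
        "--source-pool", str(max(DEFAULT_POOL, read_threads)),
        "--threads", str(write_threads),
        "--target-pool", str(max(DEFAULT_POOL, write_threads)),
    ]
    return shlex.join(args)


print(replicate_command("redis://localhost:6379", "redis://localhost:6380",
                        read_threads=16, write_threads=12))
```

With 16 reader threads and 12 writer threads, the source pool is set to 16 and the target pool to 12, so every thread has a dedicated connection.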

System Requirements

Operating System

RIOT-X works on all major operating systems but has been tested at scale on Linux X86 64-bit platforms.

CPU

CPU used by RIOT-X varies greatly depending on the specific replication settings and data structures at play. You can monitor CPU usage with the supplied Grafana dashboard (process_cpu_usage metric).

Disk

RIOT-X has no specific disk requirements since all state is kept in memory.

Memory

Memory requirements for RIOT-X itself are very light. Because RIOT-X runs on the JVM, its default initial heap size depends on the available system memory and on the operating system.

If you have very intensive replication requirements you will need to increase the JVM heap size. To estimate the worst-case memory requirement, use the formula keySize * queueSize, where:

keySize

average key size as reported by the MEMORY USAGE command

queueSize

Redis reader queue capacity configured with the --read-queue option

Conversely, if you need to minimize the memory used by RIOT-X you can lower the reader queue size (but possibly at the expense of reader throughput).
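
As a worked example of the keySize * queueSize formula, the Python sketch below uses a key size of 136 bytes (the mem value from the stream-export example earlier in this document) and the default reader queue capacity of 10000. The function name is hypothetical.

```python
def worst_case_queue_bytes(key_size, queue_size):
    """Worst-case memory held by a full reader queue: keySize * queueSize."""
    return key_size * queue_size


estimate = worst_case_queue_bytes(key_size=136, queue_size=10000)
print(estimate)                          # 1360000
print(f"{estimate / 1024**2:.1f} MiB")   # 1.3 MiB
```

The estimate scales linearly in both inputs, so halving --read-queue halves the worst-case figure.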

Network

RIOT-X replication is essentially a network bridge between the source and target Redis databases, so the underlying network is crucial for overall throughput. A 10 Gigabit network is the minimum recommended. Network latency also affects the performance of replication and of other RIOT-X commands. Make sure the host running RIOT-X offers minimal latency to both the source and target databases. You can test the latency using the ping command.

CRDB

Active/active Redis databases (CRDB) require special consideration. If your target database is a CRDB deployment you will need to use the data-structure replication type (--struct), because the RESTORE command is not supported in CRDB.