# Redis OpenTelemetry Dashboards
Redis database monitoring dashboards using OpenTelemetry Redis receiver metrics.
## Overview
These dashboards provide comprehensive monitoring for Redis instances, including client connections, memory usage, command throughput, keyspace statistics, and replication metrics.
## Dashboards
| Dashboard | File | Description |
|---|---|---|
| Overview | `overview.yaml` | Multi-instance monitoring with key metrics across all Redis instances |
| Instance Details | `instance-details.yaml` | Detailed single-instance analysis including memory, connections, keyspace, and replication metrics |
| Database Metrics | `database-metrics.yaml` | Per-database keyspace metrics including keys, TTL, and expiration statistics |
All dashboards include navigation links for easy switching between views.
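The ES|QL queries in these dashboards follow one consistent pattern: gauge metrics (such as `redis.memory.used`) are sampled with `LAST_OVER_TIME` to get the current value of each series, while counter metrics (such as `redis.keyspace.hits`) are converted to per-second rates with `RATE`. A minimal sketch of both, assuming the `metrics-*` data view and `redisreceiver.otel` dataset described below:

```esql
// Gauge: current value per instance (last sample in each time series)
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| STATS memory = MAX(LAST_OVER_TIME(redis.memory.used))
    BY instance = resource.attributes.service.instance.id

// Counter: per-second rate, bucketed over the selected time range
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| STATS hits_per_sec = SUM(RATE(redis.keyspace.hits))
    BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
```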
## Dashboard Definitions

### Overview (`overview.yaml`)
---
# Redis OpenTelemetry Overview Dashboard (ES|QL Version)
# Provides cluster-level metrics and instance health monitoring
#
# Metrics reference: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/redisreceiver
# Requires metrics-* with data_stream.dataset == "redisreceiver.otel"
dashboards:
- id: redis-overview
name: '[Metrics Redis] Overview'
description: OpenTelemetry Redis Receiver - Overview dashboard showing key metrics across all Redis instances (ES|QL)
filters:
- field: data_stream.dataset
equals: redisreceiver.otel
controls:
- type: options
id: instance-filter
label: Instance
data_view: metrics-*
field: resource.attributes.service.instance.id
panels:
# Navigation Links
- size: {w: 48, h: 2}
links:
layout: horizontal
items:
- label: Overview
dashboard: redis-overview
- label: Instance Details
dashboard: redis-instance-details
- label: Database Metrics
dashboard: redis-database-metrics
- label: Redis Documentation
url: https://redis.io/docs/
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Redis Instance Overview'
# KPI Metrics Row
- title: Total Redis Instances
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
FROM metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE resource.attributes.service.instance.id IS NOT NULL
| STATS instances = COUNT_DISTINCT(resource.attributes.service.instance.id)
primary:
field: instances
label: Instances
format:
type: number
decimals: 0
- title: Total Client Connections
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.clients.connected IS NOT NULL
| STATS clients = SUM(LAST_OVER_TIME(redis.clients.connected))
BY instance = resource.attributes.service.instance.id
| STATS clients = SUM(clients)
primary:
field: clients
label: Clients
format:
type: number
decimals: 0
- title: Commands/sec
hide_title: true
size: {w: 8, h: 5}
description: Gauge metric showing instantaneous commands per second
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.commands IS NOT NULL
| STATS commands_per_sec = SUM(LAST_OVER_TIME(redis.commands))
BY instance = resource.attributes.service.instance.id
| STATS commands_per_sec = SUM(commands_per_sec)
primary:
field: commands_per_sec
label: Commands/sec
- title: Cache Hit Rate
description: >-
Hits / (hits + misses). Values <80% indicate cache misses increasing
backend load.
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.keyspace.hits IS NOT NULL OR redis.keyspace.misses IS NOT NULL
| STATS hits = SUM(RATE(redis.keyspace.hits)), misses = SUM(RATE(redis.keyspace.misses))
| EVAL hit_rate = hits / (hits + misses + 0.000001)
| KEEP hit_rate
primary:
field: hit_rate
label: Hit Rate
format:
type: percent
- title: Total Memory Usage
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.memory.used IS NOT NULL
| STATS memory = SUM(LAST_OVER_TIME(redis.memory.used))
BY instance = resource.attributes.service.instance.id
| STATS memory = SUM(memory)
primary:
field: memory
label: Memory
format:
type: bytes
- title: Avg Memory Fragmentation
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.memory.fragmentation_ratio IS NOT NULL
| STATS fragmentation = AVG(LAST_OVER_TIME(redis.memory.fragmentation_ratio))
primary:
field: fragmentation
label: Fragmentation
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Memory & Performance Trends'
# Memory Usage by Instance (Gauge - use LAST_OVER_TIME for current state)
- title: Memory Usage by Instance
size: {w: 24, h: 12}
esql:
type: area
mode: stacked
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.memory.used IS NOT NULL
| STATS memory = MAX(LAST_OVER_TIME(redis.memory.used))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: memory
label: Memory Used
format:
type: bytes
breakdown:
field: instance
# Commands Per Second (Gauge metric - use LAST_OVER_TIME for current state)
- title: Commands Per Second
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.commands IS NOT NULL
| STATS commands = MAX(LAST_OVER_TIME(redis.commands))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: commands
label: Commands/sec
breakdown:
field: instance
# Client Connections Over Time (Gauge - use LAST_OVER_TIME for current state)
- title: Client Connections Over Time
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.clients.connected IS NOT NULL
| STATS clients = MAX(LAST_OVER_TIME(redis.clients.connected))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: clients
label: Connected Clients
format:
type: number
decimals: 0
breakdown:
field: instance
# Keyspace Hit/Miss Rate (Counter metrics - use RATE)
- title: Keyspace Hit/Miss Rate
size: {w: 24, h: 12}
esql:
type: area
mode: stacked
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.keyspace.hits IS NOT NULL OR redis.keyspace.misses IS NOT NULL
| STATS hits = SUM(RATE(redis.keyspace.hits)), misses = SUM(RATE(redis.keyspace.misses))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: hits
label: Hits/sec
- field: misses
label: Misses/sec
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Key Eviction & Expiration'
# Keys Evicted Rate (Counter - use RATE)
- title: Keys Evicted Rate
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.keys.evicted IS NOT NULL
| STATS evicted = SUM(RATE(redis.keys.evicted))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: evicted
label: Evicted/sec
breakdown:
field: instance
# Keys Expired Rate (Counter - use RATE)
- title: Keys Expired Rate
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.keys.expired IS NOT NULL
| STATS expired = SUM(RATE(redis.keys.expired))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: expired
label: Expired/sec
breakdown:
field: instance
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Network I/O'
# Network Input (Counter - use RATE)
- title: Network Input (Bytes Received)
size: {w: 24, h: 12}
esql:
type: area
mode: stacked
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.net.input IS NOT NULL
| STATS input = SUM(RATE(redis.net.input))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: input
label: Input
format:
type: bytes
breakdown:
field: instance
# Network Output (Counter - use RATE)
- title: Network Output (Bytes Sent)
size: {w: 24, h: 12}
esql:
type: area
mode: stacked
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.net.output IS NOT NULL
| STATS output = SUM(RATE(redis.net.output))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: output
label: Output
format:
type: bytes
breakdown:
field: instance
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Instance Summary Table'
# Instance Summary DataTable
- title: Redis Instance Metrics Summary
size: {w: 48, h: 20}
esql:
type: datatable
query:
- TS metrics-*
- WHERE data_stream.dataset == "redisreceiver.otel"
- WHERE resource.attributes.service.instance.id IS NOT NULL
- STATS memory = MAX(LAST_OVER_TIME(redis.memory.used)), clients = MAX(LAST_OVER_TIME(redis.clients.connected)), commands = MAX(LAST_OVER_TIME(redis.commands)),
fragmentation = MAX(LAST_OVER_TIME(redis.memory.fragmentation_ratio)) BY instance = resource.attributes.service.instance.id
- KEEP instance, memory, clients, commands, fragmentation
- SORT memory DESC
- LIMIT 100
breakdowns:
- field: instance
label: Instance
- field: memory
label: Memory Used
- field: clients
label: Clients
- field: commands
label: Commands/sec
- field: fragmentation
label: Fragmentation Ratio
### Instance Details (`instance-details.yaml`)
---
# Redis OpenTelemetry Instance Details Dashboard (ES|QL Version)
# Detailed metrics for a specific Redis instance
#
# Metrics reference: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/redisreceiver
# Requires metrics-* with data_stream.dataset == "redisreceiver.otel"
dashboards:
- id: redis-instance-details
name: '[Metrics Redis] Instance Details'
description: OpenTelemetry Redis Receiver - Detailed metrics for a specific Redis instance (ES|QL)
filters:
- field: data_stream.dataset
equals: redisreceiver.otel
controls:
- type: options
id: instance-filter
label: Instance
data_view: metrics-*
field: resource.attributes.service.instance.id
panels:
# Navigation Links
- size: {w: 48, h: 2}
links:
layout: horizontal
items:
- label: Overview
dashboard: redis-overview
- label: Instance Details
dashboard: redis-instance-details
- label: Database Metrics
dashboard: redis-database-metrics
- label: Redis Documentation
url: https://redis.io/docs/
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Redis Instance Health'
# Health KPI Row
- title: Uptime
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
FROM metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.uptime IS NOT NULL
| STATS uptime = MAX(TO_DOUBLE(redis.uptime))
primary:
field: uptime
label: Uptime (s)
- title: Connected Clients
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.clients.connected IS NOT NULL
| STATS clients = MAX(LAST_OVER_TIME(redis.clients.connected))
primary:
field: clients
label: Clients
format:
type: number
decimals: 0
- title: Blocked Clients
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.clients.blocked IS NOT NULL
| STATS blocked = MAX(LAST_OVER_TIME(redis.clients.blocked))
primary:
field: blocked
label: Blocked
format:
type: number
decimals: 0
- title: Commands/sec
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.commands IS NOT NULL
| STATS commands = MAX(LAST_OVER_TIME(redis.commands))
primary:
field: commands
label: Cmd/sec
- title: Connected Replicas
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.slaves.connected IS NOT NULL
| STATS replicas = MAX(LAST_OVER_TIME(redis.slaves.connected))
primary:
field: replicas
label: Replicas
format:
type: number
decimals: 0
- title: Cache Hit Rate
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.keyspace.hits IS NOT NULL OR redis.keyspace.misses IS NOT NULL
| STATS hits = SUM(RATE(redis.keyspace.hits)), misses = SUM(RATE(redis.keyspace.misses))
| EVAL hit_rate = hits / (hits + misses + 0.000001)
| KEEP hit_rate
primary:
field: hit_rate
label: Hit Rate
format:
type: percent
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Memory Analysis'
# Memory KPI Row
- title: Memory Used
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.memory.used IS NOT NULL
| STATS memory = MAX(LAST_OVER_TIME(redis.memory.used))
primary:
field: memory
label: Used
format:
type: bytes
- title: Memory Peak
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.memory.peak IS NOT NULL
| STATS peak = MAX(LAST_OVER_TIME(redis.memory.peak))
primary:
field: peak
label: Peak
format:
type: bytes
- title: Memory RSS
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.memory.rss IS NOT NULL
| STATS rss = MAX(LAST_OVER_TIME(redis.memory.rss))
primary:
field: rss
label: RSS
format:
type: bytes
- title: Fragmentation Ratio
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.memory.fragmentation_ratio IS NOT NULL
| STATS fragmentation = MAX(LAST_OVER_TIME(redis.memory.fragmentation_ratio))
primary:
field: fragmentation
label: Fragmentation
- title: Lua Memory
hide_title: true
size: {w: 8, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.memory.lua IS NOT NULL
| STATS lua = MAX(LAST_OVER_TIME(redis.memory.lua))
primary:
field: lua
label: Lua
format:
type: bytes
# Memory Usage Trend (Gauge metrics - use LAST_OVER_TIME for current state)
- title: Memory Usage Trend
size: {w: 48, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.memory.used IS NOT NULL OR redis.memory.rss IS NOT NULL OR redis.memory.peak IS NOT NULL
| STATS used = MAX(LAST_OVER_TIME(redis.memory.used)),
rss = MAX(LAST_OVER_TIME(redis.memory.rss)),
peak = MAX(LAST_OVER_TIME(redis.memory.peak))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: used
label: Used Memory
format:
type: bytes
- field: rss
label: RSS
format:
type: bytes
- field: peak
label: Peak
format:
type: bytes
# Memory Fragmentation Over Time
- title: Memory Fragmentation Over Time
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.memory.fragmentation_ratio IS NOT NULL
| STATS fragmentation = MAX(LAST_OVER_TIME(redis.memory.fragmentation_ratio))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: fragmentation
label: Fragmentation Ratio
# Lua Script Memory Usage
- title: Lua Script Memory Usage
size: {w: 24, h: 12}
esql:
type: area
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.memory.lua IS NOT NULL
| STATS lua = MAX(LAST_OVER_TIME(redis.memory.lua))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: lua
label: Lua Memory
format:
type: bytes
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Client Connections & Performance'
# Client Connections Over Time
- title: Client Connections Over Time
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.clients.connected IS NOT NULL OR redis.clients.blocked IS NOT NULL
| STATS connected = MAX(LAST_OVER_TIME(redis.clients.connected)),
blocked = MAX(LAST_OVER_TIME(redis.clients.blocked))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: connected
label: Connected
format:
type: number
decimals: 0
- field: blocked
label: Blocked
format:
type: number
decimals: 0
# Connection Activity (Counters - use RATE)
- title: Connection Activity
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.connections.received IS NOT NULL OR redis.connections.rejected IS NOT NULL
| STATS received = SUM(RATE(redis.connections.received)),
rejected = SUM(RATE(redis.connections.rejected))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: received
label: Received/sec
- field: rejected
label: Rejected/sec
# Client Buffer Sizes (Gauge metrics)
- title: Client Buffer Sizes
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.clients.max_input_buffer IS NOT NULL OR redis.clients.max_output_buffer IS NOT NULL
| STATS max_input = MAX(MAX_OVER_TIME(redis.clients.max_input_buffer)),
max_output = MAX(MAX_OVER_TIME(redis.clients.max_output_buffer))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: max_input
label: Max Input Buffer
format:
type: bytes
- field: max_output
label: Max Output Buffer
format:
type: bytes
# Commands Processed Rate (Counter - use RATE)
- title: Commands Processed Rate
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.commands.processed IS NOT NULL
| STATS commands = SUM(RATE(redis.commands.processed))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: commands
label: Commands/sec
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Keyspace Operations'
# Keyspace Hit/Miss Ratio (Counter - use RATE)
- title: Keyspace Hit/Miss Ratio
size: {w: 24, h: 12}
esql:
type: area
mode: stacked
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.keyspace.hits IS NOT NULL OR redis.keyspace.misses IS NOT NULL
| STATS hits = SUM(RATE(redis.keyspace.hits)),
misses = SUM(RATE(redis.keyspace.misses))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: hits
label: Hits/sec
- field: misses
label: Misses/sec
# Cache Hit Rate Over Time
- title: Cache Hit Rate Over Time
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.keyspace.hits IS NOT NULL OR redis.keyspace.misses IS NOT NULL
| STATS hits = SUM(RATE(redis.keyspace.hits)),
misses = SUM(RATE(redis.keyspace.misses))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| EVAL hit_rate = hits / (hits + misses + 0.000001)
| KEEP time_bucket, hit_rate
| SORT time_bucket ASC
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: hit_rate
label: Hit Rate
format:
type: percent
# Key Eviction & Expiration (Counters - use RATE)
- title: Key Eviction & Expiration
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.keys.evicted IS NOT NULL OR redis.keys.expired IS NOT NULL
| STATS evicted = SUM(RATE(redis.keys.evicted)),
expired = SUM(RATE(redis.keys.expired))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: evicted
label: Evicted/sec
- field: expired
label: Expired/sec
# RDB Changes Since Last Save
- title: RDB Changes Since Last Save
size: {w: 24, h: 12}
esql:
type: area
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.rdb.changes_since_last_save IS NOT NULL
| STATS changes = MAX(MAX_OVER_TIME(redis.rdb.changes_since_last_save))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: changes
label: Unsaved Changes
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Network I/O'
# Network Traffic (Counters - use RATE)
- title: Network Traffic
size: {w: 48, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.net.input IS NOT NULL OR redis.net.output IS NOT NULL
| STATS input = SUM(RATE(redis.net.input)),
output = SUM(RATE(redis.net.output))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: input
label: Input (bytes/sec)
format:
type: bytes
- field: output
label: Output (bytes/sec)
format:
type: bytes
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## CPU Usage'
# CPU Time by Type (Counter with attribute - use RATE)
- title: CPU Time by Type
size: {w: 48, h: 12}
esql:
type: area
mode: stacked
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.cpu.time IS NOT NULL
| STATS cpu_time = SUM(RATE(redis.cpu.time))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), state = attributes.state
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: cpu_time
label: CPU Time
breakdown:
field: state
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Replication'
# Replication Offset (Gauge)
- title: Replication Offset
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.replication.offset IS NOT NULL
| STATS offset = MAX(LAST_OVER_TIME(redis.replication.offset))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: offset
label: Replication Offset
format:
type: bytes
# Replication Backlog (Gauge)
- title: Replication Backlog
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.replication.backlog_first_byte_offset IS NOT NULL
| STATS backlog = MAX(LAST_OVER_TIME(redis.replication.backlog_first_byte_offset))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: backlog
label: Backlog Offset
format:
type: bytes
# Connected Replicas Over Time
- title: Connected Replicas Over Time
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.slaves.connected IS NOT NULL
| STATS replicas = MAX(LAST_OVER_TIME(redis.slaves.connected))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: replicas
label: Replicas
format:
type: number
decimals: 0
# Fork Duration (Gauge)
- title: Fork Duration
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.latest_fork IS NOT NULL
| STATS fork_duration = MAX(MAX_OVER_TIME(redis.latest_fork))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
| SORT time_bucket ASC
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: fork_duration
label: Fork Duration (μs)
### Database Metrics (`database-metrics.yaml`)
---
# Redis OpenTelemetry Database Metrics Dashboard (ES|QL Version)
# Per-database keyspace metrics
#
# Metrics reference: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/redisreceiver
# Requires metrics-* with data_stream.dataset == "redisreceiver.otel"
dashboards:
- id: redis-database-metrics
name: '[Metrics Redis] Database Metrics'
description: OpenTelemetry Redis Receiver - Per-database keyspace metrics (ES|QL)
filters:
- field: data_stream.dataset
equals: redisreceiver.otel
controls:
- type: options
id: instance-filter
label: Instance
data_view: metrics-*
field: resource.attributes.service.instance.id
- type: options
id: db-filter
label: Database
data_view: metrics-*
field: attributes.db
width: small
panels:
# Navigation Links
- size: {w: 48, h: 2}
links:
layout: horizontal
items:
- label: Overview
dashboard: redis-overview
- label: Instance Details
dashboard: redis-instance-details
- label: Database Metrics
dashboard: redis-database-metrics
- label: Redis Documentation
url: https://redis.io/docs/
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Database Overview'
# KPI Row
- title: Total Databases
hide_title: true
size: {w: 12, h: 5}
esql:
type: metric
query: |
FROM metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE attributes.db IS NOT NULL
| STATS databases = COUNT_DISTINCT(attributes.db)
primary:
field: databases
label: Databases
format:
type: number
decimals: 0
- title: Total Keys
hide_title: true
size: {w: 12, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.db.keys IS NOT NULL
| STATS keys = SUM(LAST_OVER_TIME(redis.db.keys)) BY db = attributes.db
| STATS keys = SUM(keys)
primary:
field: keys
label: Keys
format:
type: number
decimals: 0
- title: Total Keys with Expiry
hide_title: true
size: {w: 12, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.db.expires IS NOT NULL
| STATS expires = SUM(LAST_OVER_TIME(redis.db.expires)) BY db = attributes.db
| STATS expires = SUM(expires)
primary:
field: expires
label: Keys w/ TTL
format:
type: number
decimals: 0
- title: Avg TTL (All DBs)
hide_title: true
size: {w: 12, h: 5}
esql:
type: metric
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.db.avg_ttl IS NOT NULL
| STATS avg_ttl = AVG(LAST_OVER_TIME(redis.db.avg_ttl))
primary:
field: avg_ttl
label: Avg TTL (ms)
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Keys by Database'
# Key Count by Database (Bar Chart)
- title: Key Count by Database
size: {w: 24, h: 12}
esql:
type: bar
query: |
FROM metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.db.keys IS NOT NULL
| STATS keys = AVG(redis.db.keys) BY db = attributes.db
| SORT keys DESC
| LIMIT 16
legend:
visible: show
position: right
dimension:
field: db
metrics:
- field: keys
label: Keys
format:
type: number
# Keys Distribution (Pie Chart)
- title: Keys Distribution (Pie)
size: {w: 24, h: 12}
esql:
type: pie
query: |
FROM metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.db.keys IS NOT NULL
| STATS keys = AVG(redis.db.keys) BY db = attributes.db
| SORT keys DESC
| LIMIT 16
metrics:
- field: keys
label: Keys
format:
type: number
breakdowns:
- field: db
label: Database
appearance:
donut: medium
# Key Count Trend by Database (Gauge - use LAST_OVER_TIME for current state)
- title: Key Count Trend by Database
size: {w: 48, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.db.keys IS NOT NULL
| STATS keys = MAX(LAST_OVER_TIME(redis.db.keys))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), db = attributes.db
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: keys
label: Keys
format:
type: number
decimals: 0
breakdown:
field: db
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Keys with Expiration'
# Keys with TTL by Database (Bar Chart)
- title: Keys with TTL by Database
size: {w: 24, h: 12}
esql:
type: bar
query: |
FROM metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.db.expires IS NOT NULL
| STATS expires = AVG(redis.db.expires) BY db = attributes.db
| SORT expires DESC
| LIMIT 16
legend:
visible: show
position: right
dimension:
field: db
metrics:
- field: expires
label: Expires
format:
type: number
# Keys with TTL Trend (Gauge - use LAST_OVER_TIME for current state)
- title: Keys with TTL Trend
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.db.expires IS NOT NULL
| STATS expires = MAX(LAST_OVER_TIME(redis.db.expires))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), db = attributes.db
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: expires
label: Expires
format:
type: number
decimals: 0
breakdown:
field: db
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Average Time-To-Live (TTL)'
# Average TTL by Database (Bar Chart)
- title: Average TTL by Database
size: {w: 24, h: 12}
esql:
type: bar
query: |
FROM metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.db.avg_ttl IS NOT NULL
| STATS avg_ttl = AVG(redis.db.avg_ttl) BY db = attributes.db
| SORT avg_ttl DESC
| LIMIT 16
legend:
visible: show
position: right
dimension:
field: db
metrics:
- field: avg_ttl
label: Avg TTL
# Average TTL Trend by Database (Gauge - use LAST_OVER_TIME for current state)
- title: Average TTL Trend by Database
size: {w: 24, h: 12}
esql:
type: line
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE redis.db.avg_ttl IS NOT NULL
| STATS avg_ttl = MAX(LAST_OVER_TIME(redis.db.avg_ttl))
BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), db = attributes.db
| SORT time_bucket ASC
legend:
visible: show
position: right
dimension:
field: time_bucket
label: Time
data_type: date
metrics:
- field: avg_ttl
label: Avg TTL
breakdown:
field: db
# Section Header
- size: {w: 48, h: 3}
markdown:
content: '## Database Summary Table'
# Database Metrics Summary Table
- title: Database Metrics Summary
size: {w: 48, h: 20}
esql:
type: datatable
query: |
TS metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| WHERE attributes.db IS NOT NULL
| STATS keys = MAX(LAST_OVER_TIME(redis.db.keys)), expires = MAX(LAST_OVER_TIME(redis.db.expires)), avg_ttl = MAX(LAST_OVER_TIME(redis.db.avg_ttl))
BY db = attributes.db
| EVAL expire_pct = 100 * expires / (keys + 0.000001)
| KEEP db, keys, expires, expire_pct, avg_ttl
| SORT keys DESC
| LIMIT 100
breakdowns:
- field: db
label: Database
- field: keys
label: Keys
- field: expires
label: Keys with TTL
- field: expire_pct
label: Expiry %
- field: avg_ttl
label: Avg TTL (ms)
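Before loading the dashboards, it can help to confirm that Redis metrics are actually arriving. The following is a sketch of an ES|QL query you could run in Discover, using the same dataset filter the dashboards rely on; the `hit_ratio` column and the small epsilon guarding against division by zero are illustrative additions, not part of any dashboard panel:

```esql
FROM metrics-*
| WHERE data_stream.dataset == "redisreceiver.otel"
| STATS hits = MAX(redis.keyspace.hits), misses = MAX(redis.keyspace.misses)
    BY instance = resource.attributes.service.instance.id
| EVAL hit_ratio = hits / (hits + misses + 0.000001)
| SORT hit_ratio DESC
```

If this returns one row per Redis instance, the data stream is wired up correctly and all three dashboards should populate.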
Prerequisites¶
- Redis: Redis server instances (6.x or later recommended)
- OpenTelemetry Collector: Collector Contrib with Redis receiver configured
- Kibana: Version 8.x or later
Data Requirements¶
- Data stream dataset: `redisreceiver.otel`
- Data view: `metrics-*`
OpenTelemetry Collector Configuration¶
receivers:
  redis:
    endpoint: localhost:6379
    collection_interval: 10s

exporters:
  elasticsearch:
    endpoints: ["https://your-elasticsearch-instance:9200"]

service:
  pipelines:
    metrics:
      receivers: [redis]
      exporters: [elasticsearch]
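For Redis instances that require authentication or TLS, the receiver accepts additional connection settings. The fragment below is a sketch; the hostname and CA file path are hypothetical placeholders, and `${env:...}` is the Collector's standard environment-variable substitution syntax:

```yaml
receivers:
  redis:
    endpoint: redis.example.internal:6379   # hypothetical host
    collection_interval: 10s
    password: ${env:REDIS_PASSWORD}         # avoid hard-coding secrets
    tls:
      insecure: false
      ca_file: /etc/ssl/certs/redis-ca.pem  # hypothetical path
```

Keeping credentials in environment variables rather than in the config file makes the same pipeline definition safe to commit to version control.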
Metrics Reference¶
Default Metrics¶
| Metric | Type | Unit | Description | Attributes |
|---|---|---|---|---|
| `redis.clients.blocked` | Sum | `{client}` | Clients pending on a blocking call | — |
| `redis.clients.connected` | Sum | `{client}` | Client connections (excluding replicas) | — |
| `redis.clients.max_input_buffer` | Gauge | `By` | Largest input buffer among connections | — |
| `redis.clients.max_output_buffer` | Gauge | `By` | Longest output list among connections | — |
| `redis.commands` | Gauge | `{ops}/s` | Processed commands per second | — |
| `redis.commands.processed` | Sum | `{command}` | Total server commands executed | — |
| `redis.connections.received` | Sum | `{connection}` | Total accepted connections | — |
| `redis.connections.rejected` | Sum | `{connection}` | Connections denied due to maxclients | — |
| `redis.cpu.time` | Sum | `s` | CPU consumed since server start | `state` |
| `redis.db.avg_ttl` | Gauge | `ms` | Average keyspace keys TTL | `db` |
| `redis.db.expires` | Gauge | `{key}` | Keys with expiration in keyspace | `db` |
| `redis.db.keys` | Gauge | `{key}` | Total keyspace keys | `db` |
| `redis.keys.evicted` | Sum | `{key}` | Keys removed due to maxmemory limit | — |
| `redis.keys.expired` | Sum | `{event}` | Total key expiration events | — |
| `redis.keyspace.hits` | Sum | `{hit}` | Successful key lookups | — |
| `redis.keyspace.misses` | Sum | `{miss}` | Failed key lookups | — |
| `redis.latest_fork` | Gauge | `us` | Duration of most recent fork operation | — |
| `redis.memory.fragmentation_ratio` | Gauge | `1` | Ratio between RSS and used memory | — |
| `redis.memory.lua` | Gauge | `By` | Memory used by the Lua engine | — |
| `redis.memory.peak` | Gauge | `By` | Peak memory consumption | — |
| `redis.memory.rss` | Gauge | `By` | Memory allocated as viewed by the OS | — |
| `redis.memory.used` | Gauge | `By` | Bytes allocated by the Redis allocator | — |
| `redis.net.input` | Sum | `By` | Total network bytes read | — |
| `redis.net.output` | Sum | `By` | Total network bytes written | — |
| `redis.rdb.changes_since_last_save` | Sum | `{change}` | Modifications since the last dump | — |
| `redis.replication.backlog_first_byte_offset` | Gauge | `By` | Master offset of the replication backlog | — |
| `redis.replication.offset` | Gauge | `By` | Server's current replication offset | — |
| `redis.slaves.connected` | Sum | `{replica}` | Number of connected replicas | — |
| `redis.uptime` | Sum | `s` | Seconds since server start | — |
Optional Metrics (disabled by default)¶
| Metric | Type | Unit | Description | Attributes |
|---|---|---|---|---|
| `redis.cmd.calls` | Sum | `{call}` | Command execution call count | `cmd` |
| `redis.cmd.latency` | Gauge | `s` | Command execution latency | `cmd`, `percentile` |
| `redis.cmd.usec` | Sum | `us` | Total command execution time | `cmd` |
| `redis.maxmemory` | Gauge | `By` | Configured maximum memory limit | — |
| `redis.role` | Sum | `{role}` | Node's operational role | `role` |
Metric Attributes¶
| Attribute | Values | Description |
|---|---|---|
| `state` | `sys`, `sys_children`, `sys_main_thread`, `user`, `user_children`, `user_main_thread` | CPU state |
| `db` | `db0`, `db1`, etc. | Database index |
| `cmd` | Command name | Redis command |
| `percentile` | `p50`, `p99`, `p99.9` | Latency percentile |
| `role` | `replica`, `primary` | Node role |
Resource Attributes¶
| Attribute | Description |
|---|---|
| `redis.version` | Server version identifier |
| `server.address` | Server address (optional) |
| `server.port` | Server port (optional) |
Metrics Not Used in Dashboards¶
None of the optional metrics listed above are currently visualized in the dashboards. They are disabled by default and must be explicitly enabled in the OpenTelemetry Collector configuration before they appear in the data stream.
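As a sketch, enabling optional metrics follows the per-metric `enabled` convention used by Collector Contrib receivers; the two metrics shown here are examples, and any of the optional metrics above can be enabled the same way:

```yaml
receivers:
  redis:
    endpoint: localhost:6379
    collection_interval: 10s
    metrics:
      # Optional metrics are off by default; enable each one explicitly.
      redis.cmd.calls:
        enabled: true
      redis.maxmemory:
        enabled: true
```

Note that per-command metrics such as `redis.cmd.calls` add one series per Redis command, so enabling them increases series cardinality in the `metrics-*` data stream.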