Redis OpenTelemetry Dashboards

Redis database monitoring dashboards using OpenTelemetry Redis receiver metrics.

Overview

These dashboards monitor Redis instances, covering client connections, memory usage, command throughput, keyspace statistics, and replication metrics.
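
The dashboards read metrics produced by the OpenTelemetry Collector's Redis receiver and filter on `resource.attributes.service.instance.id`. The following is a minimal sketch of a Collector configuration that could feed them, not a tested deployment. The Redis address, the Elasticsearch URL, and the `redis-1` instance name are placeholders. The `resource` processor is only one possible way to populate `service.instance.id`. Authentication settings are omitted, and the exporter is assumed to write documents with `data_stream.dataset` set to `redisreceiver.otel`, which is what the dashboards expect.

```yaml
receivers:
  redis:
    endpoint: "localhost:6379"      # assumed Redis address
    collection_interval: 10s

processors:
  resource:
    attributes:
      # Assumption: set the attribute the dashboards group and filter by.
      - key: service.instance.id
        value: "redis-1"            # hypothetical instance name
        action: upsert

exporters:
  elasticsearch:
    endpoints: ["https://elasticsearch.example:9200"]  # placeholder URL

service:
  pipelines:
    metrics:
      receivers: [redis]
      processors: [resource]
      exporters: [elasticsearch]
```

If the panels stay empty after deployment, a first check is whether documents in `metrics-*` carry both the expected dataset value and a non-null `service.instance.id`.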

Dashboards

| Dashboard | File | Description |
| --- | --- | --- |
| Overview | overview.yaml | Multi-instance monitoring with key metrics across all Redis instances |
| Instance Details | instance-details.yaml | Detailed single-instance analysis including memory, connections, keyspace, and replication metrics |
| Database Metrics | database-metrics.yaml | Per-database keyspace metrics including keys, TTL, and expiration statistics |

All dashboards include navigation links for easy switching between views.

Dashboard Definitions

Overview (overview.yaml)
---
# Redis OpenTelemetry Overview Dashboard (ES|QL Version)
# Provides cluster-level metrics and instance health monitoring
#
# Metrics reference: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/redisreceiver
# Requires metrics-* with data_stream.dataset == "redisreceiver.otel"
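#
# Query conventions used by the panels in this file (a reading of the queries
# below, not a statement about ES|QL in general):
#   - Gauges (for example redis.memory.used): TS source + LAST_OVER_TIME(...)
#   - Counters (for example redis.keyspace.hits): TS source + RATE(...)
#   - Distinct counts (the instance count): FROM source + COUNT_DISTINCT(...)
#   - Charts bucket time with BUCKET(@timestamp, 20, ?_tstart, ?_tend), which
#     targets roughly 20 buckets across the selected time range.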
dashboards:
  - id: redis-overview
    name: '[Metrics Redis] Overview'
    description: OpenTelemetry Redis Receiver - Overview dashboard showing key metrics across all Redis instances (ES|QL)
    filters:
      - field: data_stream.dataset
        equals: redisreceiver.otel
    controls:
      - type: options
        id: instance-filter
        label: Instance
        data_view: metrics-*
        field: resource.attributes.service.instance.id
    panels:
      # Navigation Links
      - size: {w: 48, h: 2}
        links:
          layout: horizontal
          items:
            - label: Overview
              dashboard: redis-overview
            - label: Instance Details
              dashboard: redis-instance-details
            - label: Database Metrics
              dashboard: redis-database-metrics
            - label: Redis Documentation
              url: https://redis.io/docs/

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Redis Instance Overview'

      # KPI Metrics Row
      - title: Total Redis Instances
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            FROM metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE resource.attributes.service.instance.id IS NOT NULL
            | STATS instances = COUNT_DISTINCT(resource.attributes.service.instance.id)
          primary:
            field: instances
            label: Instances
            format:
              type: number
              decimals: 0
      - title: Total Client Connections
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.clients.connected IS NOT NULL
            | STATS clients = SUM(LAST_OVER_TIME(redis.clients.connected))
              BY instance = resource.attributes.service.instance.id
            | STATS clients = SUM(clients)
          primary:
            field: clients
            label: Clients
            format:
              type: number
              decimals: 0
      - title: Commands/sec
        hide_title: true
        size: {w: 8, h: 5}
        description: Gauge metric showing instantaneous commands per second
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.commands IS NOT NULL
            | STATS commands_per_sec = SUM(LAST_OVER_TIME(redis.commands))
              BY instance = resource.attributes.service.instance.id
            | STATS commands_per_sec = SUM(commands_per_sec)
          primary:
            field: commands_per_sec
            label: Commands/sec
      - title: Cache Hit Rate
        description: >-
          Hits / (hits + misses). Values below 80% suggest that cache misses
          are adding load to the backend.
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.keyspace.hits IS NOT NULL OR redis.keyspace.misses IS NOT NULL
            | STATS hits = SUM(RATE(redis.keyspace.hits)), misses = SUM(RATE(redis.keyspace.misses))
            | EVAL hit_rate = hits / (hits + misses + 0.000001)
            | KEEP hit_rate
          primary:
            field: hit_rate
            label: Hit Rate
            format:
              type: percent
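      # Worked example for the hit-rate expression above (illustrative numbers,
      # not measured data): with hits = 90/s and misses = 10/s,
      #   hit_rate = 90 / (90 + 10 + 0.000001) ≈ 0.90
      # which the percent format is assumed to render as about 90%.
      # With no traffic (hits = 0, misses = 0) the small constant keeps the
      # denominator non-zero, so the expression evaluates to 0.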
      - title: Total Memory Usage
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.memory.used IS NOT NULL
            | STATS memory = SUM(LAST_OVER_TIME(redis.memory.used))
              BY instance = resource.attributes.service.instance.id
            | STATS memory = SUM(memory)
          primary:
            field: memory
            label: Memory
            format:
              type: bytes
      - title: Avg Memory Fragmentation
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.memory.fragmentation_ratio IS NOT NULL
            | STATS fragmentation = AVG(LAST_OVER_TIME(redis.memory.fragmentation_ratio))
          primary:
            field: fragmentation
            label: Fragmentation

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Memory & Performance Trends'

      # Memory Usage by Instance (Gauge - use LAST_OVER_TIME for current state)
      - title: Memory Usage by Instance
        size: {w: 24, h: 12}
        esql:
          type: area
          mode: stacked
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.memory.used IS NOT NULL
            | STATS memory = MAX(LAST_OVER_TIME(redis.memory.used))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: memory
              label: Memory Used
              format:
                type: bytes
          breakdown:
            field: instance

      # Commands Per Second (Gauge metric - use LAST_OVER_TIME for current state)
      - title: Commands Per Second
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.commands IS NOT NULL
            | STATS commands = MAX(LAST_OVER_TIME(redis.commands))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: commands
              label: Commands/sec
          breakdown:
            field: instance

      # Client Connections Over Time (Gauge - use LAST_OVER_TIME for current state)
      - title: Client Connections Over Time
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.clients.connected IS NOT NULL
            | STATS clients = MAX(LAST_OVER_TIME(redis.clients.connected))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: clients
              label: Connected Clients
              format:
                type: number
                decimals: 0
          breakdown:
            field: instance

      # Keyspace Hit/Miss Rate (Counter metrics - use RATE)
      - title: Keyspace Hit/Miss Rate
        size: {w: 24, h: 12}
        esql:
          type: area
          mode: stacked
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.keyspace.hits IS NOT NULL OR redis.keyspace.misses IS NOT NULL
            | STATS hits = SUM(RATE(redis.keyspace.hits)), misses = SUM(RATE(redis.keyspace.misses))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: hits
              label: Hits/sec
            - field: misses
              label: Misses/sec

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Key Eviction & Expiration'

      # Keys Evicted Rate (Counter - use RATE)
      - title: Keys Evicted Rate
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.keys.evicted IS NOT NULL
            | STATS evicted = SUM(RATE(redis.keys.evicted))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: evicted
              label: Evicted/sec
          breakdown:
            field: instance

      # Keys Expired Rate (Counter - use RATE)
      - title: Keys Expired Rate
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.keys.expired IS NOT NULL
            | STATS expired = SUM(RATE(redis.keys.expired))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: expired
              label: Expired/sec
          breakdown:
            field: instance

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Network I/O'

      # Network Input (Counter - use RATE)
      - title: Network Input (Bytes Received)
        size: {w: 24, h: 12}
        esql:
          type: area
          mode: stacked
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.net.input IS NOT NULL
            | STATS input = SUM(RATE(redis.net.input))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: input
              label: Input (bytes/sec)
              format:
                type: bytes
          breakdown:
            field: instance

      # Network Output (Counter - use RATE)
      - title: Network Output (Bytes Sent)
        size: {w: 24, h: 12}
        esql:
          type: area
          mode: stacked
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.net.output IS NOT NULL
            | STATS output = SUM(RATE(redis.net.output))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), instance = resource.attributes.service.instance.id
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: output
              label: Output (bytes/sec)
              format:
                type: bytes
          breakdown:
            field: instance

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Instance Summary Table'

      # Instance Summary DataTable
      - title: Redis Instance Metrics Summary
        size: {w: 48, h: 20}
        esql:
          type: datatable
          query:
            - TS metrics-*
            - WHERE data_stream.dataset == "redisreceiver.otel"
            - WHERE resource.attributes.service.instance.id IS NOT NULL
            - STATS memory = MAX(LAST_OVER_TIME(redis.memory.used)), clients = MAX(LAST_OVER_TIME(redis.clients.connected)), commands = MAX(LAST_OVER_TIME(redis.commands)),
              fragmentation = MAX(LAST_OVER_TIME(redis.memory.fragmentation_ratio)) BY instance = resource.attributes.service.instance.id
            - KEEP instance, memory, clients, commands, fragmentation
            - SORT memory DESC
            - LIMIT 100
          breakdowns:
            - field: instance
              label: Instance
            - field: memory
              label: Memory Used
            - field: clients
              label: Clients
            - field: commands
              label: Commands/sec
            - field: fragmentation
              label: Fragmentation Ratio
Instance Details (instance-details.yaml)
---
# Redis OpenTelemetry Instance Details Dashboard (ES|QL Version)
# Detailed metrics for a specific Redis instance
#
# Metrics reference: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/redisreceiver
# Requires metrics-* with data_stream.dataset == "redisreceiver.otel"
dashboards:
  - id: redis-instance-details
    name: '[Metrics Redis] Instance Details'
    description: OpenTelemetry Redis Receiver - Detailed metrics for a specific Redis instance (ES|QL)
    filters:
      - field: data_stream.dataset
        equals: redisreceiver.otel
    controls:
      - type: options
        id: instance-filter
        label: Instance
        data_view: metrics-*
        field: resource.attributes.service.instance.id
    panels:
      # Navigation Links
      - size: {w: 48, h: 2}
        links:
          layout: horizontal
          items:
            - label: Overview
              dashboard: redis-overview
            - label: Instance Details
              dashboard: redis-instance-details
            - label: Database Metrics
              dashboard: redis-database-metrics
            - label: Redis Documentation
              url: https://redis.io/docs/

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Redis Instance Health'

      # Health KPI Row
      - title: Uptime
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            FROM metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.uptime IS NOT NULL
            | STATS uptime = MAX(TO_DOUBLE(redis.uptime))
          primary:
            field: uptime
            label: Uptime (s)
      - title: Connected Clients
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.clients.connected IS NOT NULL
            | STATS clients = MAX(LAST_OVER_TIME(redis.clients.connected))
          primary:
            field: clients
            label: Clients
            format:
              type: number
              decimals: 0
      - title: Blocked Clients
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.clients.blocked IS NOT NULL
            | STATS blocked = MAX(LAST_OVER_TIME(redis.clients.blocked))
          primary:
            field: blocked
            label: Blocked
            format:
              type: number
              decimals: 0
      - title: Commands/sec
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.commands IS NOT NULL
            | STATS commands = MAX(LAST_OVER_TIME(redis.commands))
          primary:
            field: commands
            label: Cmd/sec
      - title: Connected Replicas
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.slaves.connected IS NOT NULL
            | STATS replicas = MAX(LAST_OVER_TIME(redis.slaves.connected))
          primary:
            field: replicas
            label: Replicas
            format:
              type: number
              decimals: 0
      - title: Cache Hit Rate
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.keyspace.hits IS NOT NULL OR redis.keyspace.misses IS NOT NULL
            | STATS hits = SUM(RATE(redis.keyspace.hits)), misses = SUM(RATE(redis.keyspace.misses))
            | EVAL hit_rate = hits / (hits + misses + 0.000001)
            | KEEP hit_rate
          primary:
            field: hit_rate
            label: Hit Rate
            format:
              type: percent

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Memory Analysis'

      # Memory KPI Row
      - title: Memory Used
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.memory.used IS NOT NULL
            | STATS memory = MAX(LAST_OVER_TIME(redis.memory.used))
          primary:
            field: memory
            label: Used
            format:
              type: bytes
      - title: Memory Peak
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.memory.peak IS NOT NULL
            | STATS peak = MAX(LAST_OVER_TIME(redis.memory.peak))
          primary:
            field: peak
            label: Peak
            format:
              type: bytes
      - title: Memory RSS
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.memory.rss IS NOT NULL
            | STATS rss = MAX(LAST_OVER_TIME(redis.memory.rss))
          primary:
            field: rss
            label: RSS
            format:
              type: bytes
      - title: Fragmentation Ratio
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.memory.fragmentation_ratio IS NOT NULL
            | STATS fragmentation = MAX(LAST_OVER_TIME(redis.memory.fragmentation_ratio))
          primary:
            field: fragmentation
            label: Fragmentation
      - title: Lua Memory
        hide_title: true
        size: {w: 8, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.memory.lua IS NOT NULL
            | STATS lua = MAX(LAST_OVER_TIME(redis.memory.lua))
          primary:
            field: lua
            label: Lua
            format:
              type: bytes

      # Memory Usage Trend (Gauge metrics - use LAST_OVER_TIME for current state)
      - title: Memory Usage Trend
        size: {w: 48, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.memory.used IS NOT NULL OR redis.memory.rss IS NOT NULL OR redis.memory.peak IS NOT NULL
            | STATS used = MAX(LAST_OVER_TIME(redis.memory.used)),
              rss = MAX(LAST_OVER_TIME(redis.memory.rss)),
              peak = MAX(LAST_OVER_TIME(redis.memory.peak))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: used
              label: Used Memory
              format:
                type: bytes
            - field: rss
              label: RSS
              format:
                type: bytes
            - field: peak
              label: Peak
              format:
                type: bytes

      # Memory Fragmentation Over Time
      - title: Memory Fragmentation Over Time
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.memory.fragmentation_ratio IS NOT NULL
            | STATS fragmentation = MAX(LAST_OVER_TIME(redis.memory.fragmentation_ratio))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: fragmentation
              label: Fragmentation Ratio

      # Lua Script Memory Usage
      - title: Lua Script Memory Usage
        size: {w: 24, h: 12}
        esql:
          type: area
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.memory.lua IS NOT NULL
            | STATS lua = MAX(LAST_OVER_TIME(redis.memory.lua))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: lua
              label: Lua Memory
              format:
                type: bytes

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Client Connections & Performance'

      # Client Connections Over Time
      - title: Client Connections Over Time
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.clients.connected IS NOT NULL OR redis.clients.blocked IS NOT NULL
            | STATS connected = MAX(LAST_OVER_TIME(redis.clients.connected)),
              blocked = MAX(LAST_OVER_TIME(redis.clients.blocked))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: connected
              label: Connected
              format:
                type: number
                decimals: 0
            - field: blocked
              label: Blocked
              format:
                type: number
                decimals: 0

      # Connection Activity (Counters - use RATE)
      - title: Connection Activity
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.connections.received IS NOT NULL OR redis.connections.rejected IS NOT NULL
            | STATS received = SUM(RATE(redis.connections.received)),
              rejected = SUM(RATE(redis.connections.rejected))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: received
              label: Received/sec
            - field: rejected
              label: Rejected/sec

      # Client Buffer Sizes (Gauge metrics)
      - title: Client Buffer Sizes
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.clients.max_input_buffer IS NOT NULL OR redis.clients.max_output_buffer IS NOT NULL
            | STATS max_input = MAX(MAX_OVER_TIME(redis.clients.max_input_buffer)),
              max_output = MAX(MAX_OVER_TIME(redis.clients.max_output_buffer))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: max_input
              label: Max Input Buffer
              format:
                type: bytes
            - field: max_output
              label: Max Output Buffer
              format:
                type: bytes

      # Commands Processed Rate (Counter - use RATE)
      - title: Commands Processed Rate
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.commands.processed IS NOT NULL
            | STATS commands = SUM(RATE(redis.commands.processed))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: commands
              label: Commands/sec

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Keyspace Operations'

      # Keyspace Hit/Miss Rate (Counter - use RATE)
      - title: Keyspace Hit/Miss Rate
        size: {w: 24, h: 12}
        esql:
          type: area
          mode: stacked
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.keyspace.hits IS NOT NULL OR redis.keyspace.misses IS NOT NULL
            | STATS hits = SUM(RATE(redis.keyspace.hits)),
              misses = SUM(RATE(redis.keyspace.misses))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: hits
              label: Hits/sec
            - field: misses
              label: Misses/sec

      # Cache Hit Rate Over Time
      - title: Cache Hit Rate Over Time
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.keyspace.hits IS NOT NULL OR redis.keyspace.misses IS NOT NULL
            | STATS hits = SUM(RATE(redis.keyspace.hits)),
              misses = SUM(RATE(redis.keyspace.misses))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | EVAL hit_rate = hits / (hits + misses + 0.000001)
            | KEEP time_bucket, hit_rate
            | SORT time_bucket ASC
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: hit_rate
              label: Hit Rate
              format:
                type: percent

      # Key Eviction & Expiration (Counters - use RATE)
      - title: Key Eviction & Expiration
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.keys.evicted IS NOT NULL OR redis.keys.expired IS NOT NULL
            | STATS evicted = SUM(RATE(redis.keys.evicted)),
              expired = SUM(RATE(redis.keys.expired))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: evicted
              label: Evicted/sec
            - field: expired
              label: Expired/sec

      # RDB Changes Since Last Save
      - title: RDB Changes Since Last Save
        size: {w: 24, h: 12}
        esql:
          type: area
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.rdb.changes_since_last_save IS NOT NULL
            | STATS changes = MAX(MAX_OVER_TIME(redis.rdb.changes_since_last_save))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: changes
              label: Unsaved Changes

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Network I/O'

      # Network Traffic (Counters - use RATE)
      - title: Network Traffic
        size: {w: 48, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.net.input IS NOT NULL OR redis.net.output IS NOT NULL
            | STATS input = SUM(RATE(redis.net.input)),
              output = SUM(RATE(redis.net.output))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: input
              label: Input (bytes/sec)
              format:
                type: bytes
            - field: output
              label: Output (bytes/sec)
              format:
                type: bytes

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## CPU Usage'

      # CPU Time by Type (Counter with attribute - use RATE)
      - title: CPU Time by Type
        size: {w: 48, h: 12}
        esql:
          type: area
          mode: stacked
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.cpu.time IS NOT NULL
            | STATS cpu_time = SUM(RATE(redis.cpu.time))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), state = attributes.state
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: cpu_time
              label: CPU Time
          breakdown:
            field: state

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Replication'

      # Replication Offset (Gauge)
      - title: Replication Offset
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.replication.offset IS NOT NULL
            | STATS offset = MAX(LAST_OVER_TIME(redis.replication.offset))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: offset
              label: Replication Offset
              format:
                type: bytes

      # Replication Backlog (Gauge)
      - title: Replication Backlog
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.replication.backlog_first_byte_offset IS NOT NULL
            | STATS backlog = MAX(LAST_OVER_TIME(redis.replication.backlog_first_byte_offset))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: backlog
              label: Backlog Offset
              format:
                type: bytes

      # Connected Replicas Over Time
      - title: Connected Replicas Over Time
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.slaves.connected IS NOT NULL
            | STATS replicas = MAX(LAST_OVER_TIME(redis.slaves.connected))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: replicas
              label: Replicas
              format:
                type: number
                decimals: 0

      # Fork Duration (Gauge)
      - title: Fork Duration
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.latest_fork IS NOT NULL
            | STATS fork_duration = MAX(MAX_OVER_TIME(redis.latest_fork))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: fork_duration
              label: Fork Duration (μs)

Database Metrics (database-metrics.yaml)
---
# Redis OpenTelemetry Database Metrics Dashboard (ES|QL Version)
# Per-database keyspace metrics
#
# Metrics reference: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/redisreceiver
# Requires metrics-* with data_stream.dataset == "redisreceiver.otel"
dashboards:
  - id: redis-database-metrics
    name: '[Metrics Redis] Database Metrics'
    description: OpenTelemetry Redis Receiver - Per-database keyspace metrics (ES|QL)
    filters:
      - field: data_stream.dataset
        equals: redisreceiver.otel
    controls:
      - type: options
        id: instance-filter
        label: Instance
        data_view: metrics-*
        field: resource.attributes.service.instance.id
      - type: options
        id: db-filter
        label: Database
        data_view: metrics-*
        field: attributes.db
        width: small
    panels:
      # Navigation Links
      - size: {w: 48, h: 2}
        links:
          layout: horizontal
          items:
            - label: Overview
              dashboard: redis-overview
            - label: Instance Details
              dashboard: redis-instance-details
            - label: Database Metrics
              dashboard: redis-database-metrics
            - label: Redis Documentation
              url: https://redis.io/docs/

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Database Overview'

      # KPI Row
      - title: Total Databases
        hide_title: true
        size: {w: 12, h: 5}
        esql:
          type: metric
          query: |
            FROM metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE attributes.db IS NOT NULL
            | STATS databases = COUNT_DISTINCT(attributes.db)
          primary:
            field: databases
            label: Databases
            format:
              type: number
              decimals: 0
      - title: Total Keys
        hide_title: true
        size: {w: 12, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.db.keys IS NOT NULL
            | STATS keys = SUM(LAST_OVER_TIME(redis.db.keys)) BY db = attributes.db
            | STATS keys = SUM(keys)
          primary:
            field: keys
            label: Keys
            format:
              type: number
              decimals: 0
      - title: Total Keys with Expiry
        hide_title: true
        size: {w: 12, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.db.expires IS NOT NULL
            | STATS expires = SUM(LAST_OVER_TIME(redis.db.expires)) BY db = attributes.db
            | STATS expires = SUM(expires)
          primary:
            field: expires
            label: Keys w/ TTL
            format:
              type: number
              decimals: 0
      - title: Avg TTL (All DBs)
        hide_title: true
        size: {w: 12, h: 5}
        esql:
          type: metric
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.db.avg_ttl IS NOT NULL
            | STATS avg_ttl = AVG(LAST_OVER_TIME(redis.db.avg_ttl))
          primary:
            field: avg_ttl
            label: Avg TTL (ms)

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Keys by Database'

      # Key Count by Database (Bar Chart)
      - title: Key Count by Database
        size: {w: 24, h: 12}
        esql:
          type: bar
          query: |
            FROM metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.db.keys IS NOT NULL
            | STATS keys = AVG(redis.db.keys) BY db = attributes.db
            | SORT keys DESC
            | LIMIT 16
          legend:
            visible: show
            position: right
          dimension:
            field: db
          metrics:
            - field: keys
              label: Keys
              format:
                type: number

      # Keys Distribution (Pie Chart)
      - title: Keys Distribution (Pie)
        size: {w: 24, h: 12}
        esql:
          type: pie
          query: |
            FROM metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.db.keys IS NOT NULL
            | STATS keys = AVG(redis.db.keys) BY db = attributes.db
            | SORT keys DESC
            | LIMIT 16
          metrics:
            - field: keys
              label: Keys
              format:
                type: number
          breakdowns:
            - field: db
              label: Database
          appearance:
            donut: medium

      # Key Count Trend by Database (Gauge - use LAST_OVER_TIME for current state)
      - title: Key Count Trend by Database
        size: {w: 48, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.db.keys IS NOT NULL
            | STATS keys = MAX(LAST_OVER_TIME(redis.db.keys))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), db = attributes.db
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: keys
              label: Keys
              format:
                type: number
                decimals: 0
          breakdown:
            field: db

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Keys with Expiration'

      # Keys with TTL by Database (Bar Chart)
      - title: Keys with TTL by Database
        size: {w: 24, h: 12}
        esql:
          type: bar
          query: |
            FROM metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.db.expires IS NOT NULL
            | STATS expires = AVG(redis.db.expires) BY db = attributes.db
            | SORT expires DESC
            | LIMIT 16
          legend:
            visible: show
            position: right
          dimension:
            field: db
          metrics:
            - field: expires
              label: Expires
              format:
                type: number

      # Keys with TTL Trend (Gauge - use LAST_OVER_TIME for current state)
      - title: Keys with TTL Trend
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.db.expires IS NOT NULL
            | STATS expires = MAX(LAST_OVER_TIME(redis.db.expires))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), db = attributes.db
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: expires
              label: Expires
              format:
                type: number
                decimals: 0
          breakdown:
            field: db

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Average Time-To-Live (TTL)'

      # Average TTL by Database (Bar Chart)
      - title: Average TTL by Database
        size: {w: 24, h: 12}
        esql:
          type: bar
          query: |
            FROM metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.db.avg_ttl IS NOT NULL
            | STATS avg_ttl = AVG(redis.db.avg_ttl) BY db = attributes.db
            | SORT avg_ttl DESC
            | LIMIT 16
          legend:
            visible: show
            position: right
          dimension:
            field: db
          metrics:
            - field: avg_ttl
              label: Avg TTL (ms)

      # Average TTL Trend by Database (Gauge - use LAST_OVER_TIME for current state)
      - title: Average TTL Trend by Database
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.db.avg_ttl IS NOT NULL
            | STATS avg_ttl = MAX(LAST_OVER_TIME(redis.db.avg_ttl))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend), db = attributes.db
            | SORT time_bucket ASC
          legend:
            visible: show
            position: right
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: avg_ttl
              label: Avg TTL (ms)
          breakdown:
            field: db

      # Section Header
      - size: {w: 48, h: 3}
        markdown:
          content: '## Database Summary Table'

      # Database Metrics Summary Table
      - title: Database Metrics Summary
        size: {w: 48, h: 20}
        esql:
          type: datatable
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE attributes.db IS NOT NULL
            | STATS keys = MAX(LAST_OVER_TIME(redis.db.keys)),
              expires = MAX(LAST_OVER_TIME(redis.db.expires)),
              avg_ttl = MAX(LAST_OVER_TIME(redis.db.avg_ttl))
              BY db = attributes.db
            | EVAL expire_pct = 100.0 * expires / (keys + 0.000001)
            | KEEP db, keys, expires, expire_pct, avg_ttl
            | SORT keys DESC
            | LIMIT 100
          breakdowns:
            - field: db
              label: Database
            - field: keys
              label: Keys
            - field: expires
              label: Keys with TTL
            - field: expire_pct
              label: Expiry %
            - field: avg_ttl
              label: Avg TTL (ms)

Prerequisites

  • Redis: Redis server instances (6.x or later recommended)
  • OpenTelemetry Collector: Collector Contrib with Redis receiver configured
  • Kibana: Version 8.x or later

Data Requirements

  • Data stream dataset: redisreceiver.otel
  • Data view: metrics-*

OpenTelemetry Collector Configuration

receivers:
  redis:
    endpoint: localhost:6379
    collection_interval: 10s

exporters:
  elasticsearch:
    endpoints: ["https://your-elasticsearch-instance:9200"]

service:
  pipelines:
    metrics:
      receivers: [redis]
      exporters: [elasticsearch]
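
The Overview dashboard compares several Redis instances, so a collector often needs more than one receiver. The following is a minimal sketch rather than a recommended setup: the host names (redis-a.example.internal, redis-b.example.internal), the receiver names (redis/a, redis/b), and the REDIS_PASSWORD environment variable are all assumptions made for illustration.

```yaml
receivers:
  # Named receiver instances use the type/name form; these names are illustrative.
  redis/a:
    endpoint: redis-a.example.internal:6379   # hypothetical host
    collection_interval: 10s
    password: ${env:REDIS_PASSWORD}           # assumes the variable is set for the collector
  redis/b:
    endpoint: redis-b.example.internal:6379   # hypothetical host
    collection_interval: 10s
    password: ${env:REDIS_PASSWORD}

exporters:
  elasticsearch:
    endpoints: ["https://your-elasticsearch-instance:9200"]

service:
  pipelines:
    metrics:
      receivers: [redis/a, redis/b]
      exporters: [elasticsearch]
```

The Instance control in the dashboards filters on resource.attributes.service.instance.id, so it is worth checking that data from each receiver carries a distinct value in that field; whether it is populated depends on your collector setup.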

Metrics Reference

Default Metrics

| Metric | Type | Unit | Description | Attributes |
| --- | --- | --- | --- | --- |
| redis.clients.blocked | Sum | {client} | Clients pending on a blocking call | |
| redis.clients.connected | Sum | {client} | Client connections (excluding replicas) | |
| redis.clients.max_input_buffer | Gauge | By | Largest input buffer among connections | |
| redis.clients.max_output_buffer | Gauge | By | Longest output list among connections | |
| redis.commands | Gauge | {ops}/s | Processed commands per second | |
| redis.commands.processed | Sum | {command} | Total server commands executed | |
| redis.connections.received | Sum | {connection} | Total accepted connections | |
| redis.connections.rejected | Sum | {connection} | Connections denied due to maxclients | |
| redis.cpu.time | Sum | s | CPU consumed since server start | state |
| redis.db.avg_ttl | Gauge | ms | Average keyspace keys TTL | db |
| redis.db.expires | Gauge | {key} | Keys with expiration in keyspace | db |
| redis.db.keys | Gauge | {key} | Total keyspace keys | db |
| redis.keys.evicted | Sum | {key} | Keys removed due to maxmemory limit | |
| redis.keys.expired | Sum | {event} | Total key expiration events | |
| redis.keyspace.hits | Sum | {hit} | Successful key lookups | |
| redis.keyspace.misses | Sum | {miss} | Failed key lookups | |
| redis.latest_fork | Gauge | us | Duration of most recent fork operation | |
| redis.memory.fragmentation_ratio | Gauge | 1 | Ratio between RSS and used memory | |
| redis.memory.lua | Gauge | By | Memory used by Lua engine | |
| redis.memory.peak | Gauge | By | Peak memory consumption | |
| redis.memory.rss | Gauge | By | Memory allocated as viewed by OS | |
| redis.memory.used | Gauge | By | Bytes allocated by Redis allocator | |
| redis.net.input | Sum | By | Total network bytes read | |
| redis.net.output | Sum | By | Total network bytes written | |
| redis.rdb.changes_since_last_save | Sum | {change} | Modifications since last dump | |
| redis.replication.backlog_first_byte_offset | Gauge | By | Master offset of replication backlog | |
| redis.replication.offset | Gauge | By | Server's current replication offset | |
| redis.slaves.connected | Sum | {replica} | Number of connected replicas | |
| redis.uptime | Sum | s | Seconds since server start | |

Optional Metrics (disabled by default)

| Metric | Type | Unit | Description | Attributes |
| --- | --- | --- | --- | --- |
| redis.cmd.calls | Sum | {call} | Command execution call count | cmd |
| redis.cmd.latency | Gauge | s | Command execution latency | cmd, percentile |
| redis.cmd.usec | Sum | us | Total command execution time | cmd |
| redis.maxmemory | Gauge | By | Configured maximum memory limit | |
| redis.role | Sum | {role} | Node's operational role | role |
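
Optional metrics are switched on per metric under the receiver's metrics key. The sketch below enables all five on the single-instance receiver used elsewhere in this guide; the endpoint and interval are carried over as assumptions, and in practice you would enable only the metrics you need.

```yaml
receivers:
  redis:
    endpoint: localhost:6379
    collection_interval: 10s
    metrics:
      # Each optional metric is enabled individually.
      redis.cmd.calls:
        enabled: true
      redis.cmd.latency:
        enabled: true
      redis.cmd.usec:
        enabled: true
      redis.maxmemory:
        enabled: true
      redis.role:
        enabled: true
```

The per-command metrics (redis.cmd.*) produce one series per cmd value, so enabling them may noticeably increase the number of time series stored.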

Metric Attributes

| Attribute | Values | Description |
| --- | --- | --- |
| state | sys, sys_children, sys_main_thread, user, user_children, user_main_thread | CPU state |
| db | db0, db1, etc. | Database index |
| cmd | Command name | Redis command |
| percentile | p50, p99, p99.9 | Latency percentile |
| role | replica, primary | Node role |

Resource Attributes

| Attribute | Description |
| --- | --- |
| redis.version | Server version identifier |
| server.address | Server address (optional) |
| server.port | Server port (optional) |

Metrics Not Used in Dashboards

None of the optional metrics listed in the Optional Metrics section above are currently visualized in the dashboards. They are disabled by default and must be explicitly enabled in the OpenTelemetry Collector configuration.
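
Once an optional metric is being collected, a panel for it can follow the same pattern as the existing gauge panels. The following is a sketch only, not part of the shipped dashboards: it assumes redis.maxmemory has been enabled in the collector, and the title, field name, and label are illustrative.

```yaml
      # Hypothetical panel; append under `panels:` in instance-details.yaml.
      # Max Memory (Gauge - use LAST_OVER_TIME for current state)
      - title: Configured Max Memory
        size: {w: 24, h: 12}
        esql:
          type: line
          query: |
            TS metrics-*
            | WHERE data_stream.dataset == "redisreceiver.otel"
            | WHERE redis.maxmemory IS NOT NULL
            | STATS maxmemory = MAX(LAST_OVER_TIME(redis.maxmemory))
              BY time_bucket = BUCKET(@timestamp, 20, ?_tstart, ?_tend)
            | SORT time_bucket ASC
          dimension:
            field: time_bucket
            label: Time
            data_type: date
          metrics:
            - field: maxmemory
              label: Max Memory
              format:
                type: bytes
```

A maxmemory value of 0 in Redis means no limit is configured, so a flat line at zero here does not by itself indicate a collection problem.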