
# Troubleshooting

Use this page when FastForward is running but results are wrong, incomplete, or unstable. Start with the symptom table, run the exact checks, and compare expected output before changing config.

| Symptom | First check | Expected output | If not expected |
| --- | --- | --- | --- |
| No logs arrive at destination | `curl -s http://localhost:9090/admin/v1/status \| jq '.pipelines[0].inputs'` | `lines_total` increasing | See Scenario 1 |
| Logs read, but nothing forwarded | `curl -s http://localhost:9090/admin/v1/status \| jq '.pipelines[0].transform'` | `lines_in > 0` and `lines_out > 0` | See Scenario 2 |
| Frequent OTLP send errors | Check runtime logs for `error sending` | No repeated connection/auth errors | Fix endpoint/protocol/connectivity (Scenario 3) |
| Startup/config errors | `ff validate --config config.yaml` | Output contains `config ok:` | Fix required fields / YAML syntax (Scenario 4) |
| Throughput unexpectedly low | `curl -s http://localhost:9090/admin/v1/status \| jq '.pipelines[0].stage_seconds'` | `output` not dominating total | See Scenario 5 |
## Scenario 1: No logs arrive at the destination

```sh
# 1) Are inputs being read?
curl -s http://localhost:9090/admin/v1/status | jq '.pipelines[0].inputs'

# 2) Are output counters increasing?
curl -s http://localhost:9090/admin/v1/status | jq '.pipelines[0].outputs'
```

Expected:

- `inputs[*].lines_total` increases over time.
- `outputs[*].lines_total` increases over time.
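To confirm the counters actually move, sample twice and compare; a small sketch against the status endpoint above (it assumes `inputs` is an array of objects carrying `lines_total`):

```sh
# Sum input line counters, wait, then sample again
A=$(curl -s http://localhost:9090/admin/v1/status | jq '[.pipelines[0].inputs[].lines_total] | add')
sleep 10
B=$(curl -s http://localhost:9090/admin/v1/status | jq '[.pipelines[0].inputs[].lines_total] | add')
echo "lines read in 10s: $((B - A))"
```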
Common causes and fixes:

1. Wrong file glob.
   - Fix `input.path` and validate that the directory exists inside the container/pod.
2. Missing host mount (see the mount sketch after this list).
   - Ensure `/var/log` is mounted read-only in Docker/Kubernetes.
3. Permission denied on log files.
   - Run with access to host log files and verify the container security context.
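For cause 2, a minimal read-only mount; a Docker sketch only, with the image name, config path, and runtime flag as placeholders:

```sh
# Mount host logs read-only so the file input can see them
docker run -d \
  -v /var/log:/var/log:ro \
  -v "$PWD/config.yaml:/etc/ffwd/config.yaml:ro" \
  ffwd:latest --config /etc/ffwd/config.yaml
```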

Run the two checks again and confirm both input and output counters increase.

## Scenario 2: Inputs increase, but outputs remain zero

```sh
curl -s http://localhost:9090/admin/v1/status | jq '.pipelines[0].transform'
```
Expected:

- `lines_in` and `lines_out` are both greater than zero.
- `filter_drop_rate` is not near 1.0.

Common causes and fixes:

1. SQL `WHERE` clause filters everything.
   - Temporarily remove `WHERE` to confirm data flow.
2. Case mismatch in filters (see the sketch after this list).
   - Example: `level = 'error'` vs actual `ERROR`.
3. Wrong field names.
   - Verify that selected columns exist in parsed records.
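A case-insensitive comparison sidesteps cause 2; a minimal sketch, assuming the block-scalar `transform` form shown in Scenario 4 and a SQL dialect with `UPPER()`:

```yaml
transform: |
  SELECT level, message
  FROM logs
  WHERE UPPER(level) = 'ERROR'  -- matches 'error', 'Error', and 'ERROR'
```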

`lines_out` should increase within a few seconds under load.

## Scenario 3: OTLP send failures (connection refused, timeouts, auth)

```sh
# HTTP OTLP health check (local)
curl -X POST http://localhost:4318/v1/logs \
  -H 'Content-Type: application/json' \
  -d '{}'

# HTTP OTLP health check (Kubernetes: replace otel-collector with your namespace/service)
curl -X POST http://otel-collector:4318/v1/logs \
  -H 'Content-Type: application/json' \
  -d '{}'

# Kubernetes DNS resolution check
POD=$(kubectl -n collectors get pods -l app=ffwd -o jsonpath='{.items[0].metadata.name}')
kubectl -n collectors exec "$POD" -- nslookup otel-collector
```
Expected:

- DNS resolves the collector service name.
- The endpoint responds (even a non-200 response proves reachability).
- Runtime logs stop repeating send errors.

Common causes and fixes:

1. Protocol/port mismatch.
   - gRPC uses 4317; HTTP uses 4318.
2. Wrong namespace-qualified service name.
   - Use the full in-cluster DNS name (`<service>.<namespace>.svc.cluster.local`) when needed.
3. Network policy blocking egress (see the policy sketch after this list).
   - Allow traffic from the collectors namespace to the collector service.
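For cause 3, a minimal egress allowance; a sketch only, which assumes the forwarder pods are labeled `app: ffwd` (as in the kubectl check above) and that the collector lives in a namespace named `observability` (that namespace and the port list are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-otlp-egress
  namespace: collectors
spec:
  podSelector:
    matchLabels:
      app: ffwd                 # the forwarder pods
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: observability  # collector namespace (assumed)
      ports:
        - protocol: TCP
          port: 4317            # OTLP gRPC
        - protocol: TCP
          port: 4318            # OTLP HTTP
```

In locked-down clusters you may also need an egress rule for DNS, or the nslookup check above will fail first.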

`outputs[*].errors` stops increasing and `outputs[*].lines_total` resumes growing.
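To watch for that, poll the output counters; a quick sketch using only the fields named above:

```sh
# Refresh error and line counters for the first pipeline's outputs every 5 seconds
watch -n 5 "curl -s http://localhost:9090/admin/v1/status | jq '.pipelines[0].outputs[] | {errors, lines_total}'"
```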

## Scenario 4: Startup fails with config validation errors

```sh
ff validate --config config.yaml
```

Expected output:

```
ready: default
config ok: 1 pipeline(s)
```

Exit code 0 on success, 1 on configuration error.
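Because the exit code is reliable, validation can gate a deploy; a minimal sketch:

```sh
# Abort the deploy step if the config does not validate
if ! ff validate --config config.yaml; then
  echo "config invalid; aborting deploy" >&2
  exit 1
fi
```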

Common validation failures:

- Missing `input.path` for a file input.
- Missing `endpoint` for `otlp`/`http`/`elasticsearch`/`loki` outputs.
- Missing the required `pipelines` map, or wrong nesting for `inputs`/`outputs` (see the minimal example after this list).
- YAML scalar mistakes in SQL.
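A minimal config showing that nesting; a sketch only, using the pipeline name `default` from the validator output above (the `type` values and everything beyond `pipelines`, `input.path`, and `endpoint` are assumptions):

```yaml
pipelines:
  default:
    inputs:
      - type: file              # input type name assumed
        path: /var/log/*.log
    outputs:
      - type: otlp              # output type name assumed
        endpoint: http://otel-collector:4318
```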

Use block scalar syntax (`|`) for SQL, and indent the query under the key so YAML parses it as one string:

```yaml
transform: |
  SELECT level, message
  FROM logs
  WHERE level = 'ERROR'
```

Validation passes, then the dry run succeeds:

```sh
ff dry-run --config config.yaml
```

## Scenario 5: Throughput drops or latency spikes

```sh
curl -s http://localhost:9090/admin/v1/status | jq '.pipelines[0].stage_seconds'
```

Stage times should be stable, with no sudden sustained growth in output time.
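To spot the dominant stage at a glance, a jq sketch (it assumes `stage_seconds` is a map of stage name to seconds):

```sh
# Print the stage with the largest accumulated time
curl -s http://localhost:9090/admin/v1/status \
  | jq '.pipelines[0].stage_seconds | to_entries | max_by(.value)'
```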

Common causes and fixes:

1. Output stage dominates (see the compression sketch after this list).
   - Move the collector closer, enable compression (`compression: zstd`), or scale the collector.
2. Excessive transform complexity.
   - Simplify the query or split it into named pipelines.
3. Node resource pressure.
   - Increase CPU/memory requests for the ffwd DaemonSet.
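Where compression sits in an output block; a sketch reusing the assumed nesting from Scenario 4 (only `compression: zstd` itself is confirmed by this page):

```yaml
outputs:
  - type: otlp                            # type name assumed
    endpoint: http://otel-collector:4318
    compression: zstd                     # fewer bytes on the wire, at some CPU cost
```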

Observe reduced output stage time and stable forwarding counters.

If you need immediate stabilization while investigating:

1. Remove complex transform filters temporarily.
2. Send to stdout in a non-production environment to confirm the parse path (see the sketch below).
3. Re-enable OTLP once counters and errors look healthy.

This narrows failure scope without changing input collection semantics.
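For step 2, a stdout variant of the Scenario 4 sketch (the `stdout` type name is an assumption):

```yaml
pipelines:
  default:
    inputs:
      - type: file
        path: /var/log/*.log
    outputs:
      - type: stdout            # type name assumed; prints records for inspection
```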

Useful endpoints for any of the scenarios above:

```sh
# Health and readiness
curl -s http://localhost:9090/live | jq .
curl -s http://localhost:9090/ready | jq .

# End-to-end pipeline stats
curl -s http://localhost:9090/admin/v1/status | jq .

# Flattened stats snapshot
curl -s http://localhost:9090/admin/v1/stats | jq .
```
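The health endpoints map naturally onto Kubernetes probes; a container-spec fragment for the ffwd DaemonSet (paths and port from the commands above, timings illustrative):

```yaml
livenessProbe:
  httpGet:
    path: /live
    port: 9090
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 9090
  periodSeconds: 5
```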
| Topic | Where to go |
| --- | --- |
| Check pipeline metrics | Monitoring & Diagnostics |
| Review your config | YAML Reference |
| Understand the pipeline | Pipeline Explorer (interactive) |