Troubleshooting
Use this page when FastForward is running but results are wrong, incomplete, or unstable. Start with the symptom table, run the exact checks, and compare expected output before changing config.
Symptom-first triage
| Symptom | First check | Expected output | If not expected |
|---|---|---|---|
| No logs arrive at destination | `curl -s http://localhost:9090/admin/v1/status \| jq '.pipelines[0].inputs'` | `lines_total` increasing | See Scenario 1 |
| Logs read, but nothing forwarded | `curl -s http://localhost:9090/admin/v1/status \| jq '.pipelines[0].transform'` | `lines_in` > 0 and `lines_out` > 0 | See Scenario 2 |
| Frequent OTLP send errors | Check runtime logs for `error sending` | No repeated connection/auth errors | Fix endpoint/protocol/connectivity |
| Startup/config errors | `ff validate --config config.yaml` | Output contains `config ok:` | Fix required fields / YAML syntax |
| Throughput unexpectedly low | `curl -s http://localhost:9090/admin/v1/status \| jq '.pipelines[0].stage_seconds'` | Output stage not dominating total | See Scenario 5 |
Scenario 1: No logs arrive at destination
Checks

    # 1) Are inputs being read?
    curl -s http://localhost:9090/admin/v1/status | jq '.pipelines[0].inputs'

    # 2) Are output counters increasing?
    curl -s http://localhost:9090/admin/v1/status | jq '.pipelines[0].outputs'

Expected
`inputs[*].lines_total` increases over time.
`outputs[*].lines_total` increases over time.
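"Increases over time" is easiest to confirm by sampling the same counter twice. The sketch below is one possible way to do that; `read_counter` and `counter_increased` are hypothetical helper names, and the jq paths assume `inputs` and `outputs` are arrays whose entries carry `lines_total`.

```shell
# Assumed admin endpoint from this page; override STATUS_URL if yours differs.
STATUS_URL="${STATUS_URL:-http://localhost:9090/admin/v1/status}"

# read_counter FILTER (hypothetical helper): print one numeric counter
# selected by a jq filter.
read_counter() {
  curl -s "$STATUS_URL" | jq -r "$1"
}

# counter_increased FILTER [SECONDS]: sample the counter twice and succeed
# only if the second reading is larger than the first.
counter_increased() {
  filter=$1
  wait_s=${2:-5}
  first=$(read_counter "$filter")
  sleep "$wait_s"
  second=$(read_counter "$filter")
  echo "$filter: $first -> $second"
  [ "$second" -gt "$first" ]
}

# Usage (requires a running instance):
#   counter_increased '.pipelines[0].inputs[0].lines_total' 5
#   counter_increased '.pipelines[0].outputs[0].lines_total' 5
```

A non-zero exit status means the counter did not move during the sampling window, which on an idle host can also simply mean no new log lines were written.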
Common causes and fixes
- Wrong file glob.
  - Fix `input.path` and validate the directory exists inside the container/pod.
- Missing host mount.
  - Ensure `/var/log` is mounted read-only in Docker/Kubernetes.
- Permission denied on log files.
  - Run with access to host log files and verify container security context.
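All three causes above show up the same way from inside the container: the configured pattern matches no readable files. A rough check, assuming the pattern is a plain shell-style glob without spaces (the forwarder's own glob syntax may differ), could look like this. `count_readable` is a hypothetical helper name.

```shell
# count_readable PATTERN: print how many files match the glob and are
# readable by the current user. Run it inside the container or pod with
# the same pattern you configured in input.path.
count_readable() {
  count=0
  # The pattern is left unquoted on purpose so the shell expands the glob.
  for f in $1; do
    if [ -f "$f" ] && [ -r "$f" ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

# Usage (the path is only an example):
#   count_readable '/var/log/*.log'
```

A result of `0` points at the glob, the mount, or permissions rather than at the transform or output stages.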
Verify fix
Run the two checks again and confirm both input and output counters increase.
Scenario 2: Inputs increase, but outputs remain zero
Checks

    curl -s http://localhost:9090/admin/v1/status | jq '.pipelines[0].transform'

Expected
`lines_in` and `lines_out` are both greater than zero.
`filter_drop_rate` is not near `1.0`.
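If you want to sanity-check the reported rate against the raw counters, the sketch below computes `1 - lines_out / lines_in`. That formula is an assumption about how `filter_drop_rate` is derived, not something this page confirms, and `drop_rate` is a hypothetical helper name.

```shell
# drop_rate LINES_IN LINES_OUT: print 1 - out/in with two decimals.
# Assumption: this approximates filter_drop_rate; the real metric may be
# computed differently (for example over a time window).
drop_rate() {
  awk -v in_n="$1" -v out_n="$2" 'BEGIN {
    if (in_n <= 0) { print "n/a"; exit }
    printf "%.2f\n", 1 - out_n / in_n
  }'
}

# Usage:
#   drop_rate 1000 250
```

A value close to `1.00` means almost every line read is being filtered out before the output stage.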
Common causes and fixes
- SQL `WHERE` clause filters everything.
  - Temporarily remove `WHERE` to confirm data flow.
- Case mismatch in filters.
  - Example: `level = 'error'` vs actual `ERROR`.
- Wrong field names.
  - Verify selected columns exist in parsed records.
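To see which spelling of a level your logs actually use, you can count the distinct values in a sample file. This is a minimal sketch that assumes JSON-lines records with a string field named `level`; `level_variants` is a hypothetical helper name, and other log formats need a different pattern.

```shell
# level_variants FILE: count each spelling of the "level" value in a
# JSON-lines sample.
# Assumption: records are JSON objects with a string field named "level".
level_variants() {
  grep -oE '"level"[[:space:]]*:[[:space:]]*"[^"]*"' "$1" |
    sed -E 's/.*:[[:space:]]*"([^"]*)"/\1/' |
    sort | uniq -c
}

# Usage (the path is only an example):
#   level_variants /var/log/app/sample.log
```

If the output lists `ERROR` but the filter compares against `'error'`, the `WHERE` clause matches nothing.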
Verify fix
`lines_out` should increase within a few seconds under load.
Scenario 3: OTLP send failures (connection refused, timeouts, auth)
Checks

    # HTTP OTLP health check (local)
    curl -X POST http://localhost:4318/v1/logs \
      -H 'Content-Type: application/json' \
      -d '{}'

    # HTTP OTLP health check (Kubernetes; replace otel-collector with your namespace/service)
    curl -X POST http://otel-collector:4318/v1/logs \
      -H 'Content-Type: application/json' \
      -d '{}'

    # Kubernetes DNS resolution check
    POD=$(kubectl -n collectors get pods -l app=ffwd -o jsonpath='{.items[0].metadata.name}')
    kubectl -n collectors exec "$POD" -- nslookup otel-collector

Expected
- DNS resolves collector service name.
- Endpoint responds (even non-200 can prove reachability).
- Runtime logs stop repeating send errors.
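Because a non-200 response can still prove reachability, it helps to look at the status code rather than the response body. The sketch below uses curl's `%{http_code}` write-out variable, which prints `000` when no HTTP response was received. `probe` and `explain_code` are hypothetical helper names, and the verdict strings are rough interpretations, not output from FastForward or the collector.

```shell
# probe URL: send the same empty POST as the health check and print only
# the HTTP status code (000 means curl got no HTTP response at all).
probe() {
  curl -s -o /dev/null -m 5 -w '%{http_code}' -X POST "$1" \
    -H 'Content-Type: application/json' -d '{}'
}

# explain_code CODE: map a status code to a rough reachability verdict.
explain_code() {
  case "$1" in
    000) echo "unreachable: no HTTP response (DNS, refused connection, or timeout)" ;;
    2??) echo "reachable: request accepted" ;;
    401|403) echo "reachable: authentication or authorization rejected" ;;
    404) echo "reachable: path not found, check the URL" ;;
    *) echo "reachable: HTTP $1" ;;
  esac
}

# Usage:
#   explain_code "$(probe http://localhost:4318/v1/logs)"
```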
Common causes and fixes
- Protocol/port mismatch.
  - gRPC uses `4317`, HTTP uses `4318`.
- Wrong namespace-qualified service name.
  - Use full in-cluster DNS name when needed.
- Network policy blocking egress.
  - Allow traffic from the `collectors` namespace to the collector service.
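The full in-cluster DNS name of a Kubernetes Service follows the pattern `<service>.<namespace>.svc.<cluster-domain>`. A small sketch for building it, assuming the common default cluster domain `cluster.local` (your cluster may use another); `service_fqdn` and the `observability` namespace are hypothetical names used only for illustration.

```shell
# service_fqdn SERVICE NAMESPACE [CLUSTER_DOMAIN]: print the fully
# qualified in-cluster DNS name of a Service.
# Assumption: the cluster domain defaults to cluster.local.
service_fqdn() {
  echo "$1.$2.svc.${3:-cluster.local}"
}

# Usage (namespace is a placeholder):
#   service_fqdn otel-collector observability
```

Using the fully qualified name matters when the collector Service lives in a different namespace than the forwarder pods, because the short name only resolves within the pod's own namespace.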
Verify fix
`outputs[*].errors` stops increasing and `outputs[*].lines_total` resumes growing.
Scenario 4: Startup fails with config validation errors
Checks

    ff validate --config config.yaml

Expected

      ready: default
    config ok: 1 pipeline(s)

Exit code 0 on success, 1 on configuration error.
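Since validation signals the result through its exit code, it can gate a deploy script or CI step. The wrapper below is a sketch; `validate_config` is a hypothetical function name, and it relies only on the `ff validate --config` invocation and exit codes described above.

```shell
# validate_config FILE: run validation and propagate the exit code
# (0 on success, non-zero on configuration error).
validate_config() {
  if ff validate --config "$1"; then
    echo "validation passed: $1"
  else
    status=$?
    echo "validation failed (exit $status): $1" >&2
    return "$status"
  fi
}

# Usage:
#   validate_config config.yaml || exit 1
```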
Common causes and fixes
- Missing `input.path` for `file` input.
- Missing `endpoint` for `otlp`/`http`/`elasticsearch`/`loki` outputs.
- Missing the required `pipelines` map or using the wrong nesting for `inputs`/`outputs`.
- YAML scalar mistakes in SQL.
Use block scalar syntax for SQL:
    transform: |
      SELECT level, message
      FROM logs
      WHERE level = 'ERROR'

Verify fix
Validation passes, then dry-run succeeds:

    ff dry-run --config config.yaml

Scenario 5: Throughput drops or latency spikes
Checks

    curl -s http://localhost:9090/admin/v1/status | jq '.pipelines[0].stage_seconds'

Expected
Stage times should be stable, with no sudden sustained growth in output time.
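To judge whether one stage dominates, it can help to turn the raw seconds into a share of the total. The sketch below reads `stage seconds` pairs from standard input; `dominant_stage` is a hypothetical helper name, and the jq expression in the usage comment assumes `stage_seconds` is an object that maps stage names to seconds, which this page does not confirm.

```shell
# dominant_stage: read "stage seconds" pairs on stdin and print the stage
# with the largest share of the total, as "name percent".
dominant_stage() {
  awk '
    { total += $2; if (NR == 1 || $2 > max) { max = $2; name = $1 } }
    END { if (total > 0) printf "%s %.0f%%\n", name, 100 * max / total }
  '
}

# Usage (assumes stage_seconds is an object of name -> seconds):
#   curl -s http://localhost:9090/admin/v1/status |
#     jq -r '.pipelines[0].stage_seconds | to_entries[] | "\(.key) \(.value)"' |
#     dominant_stage
```

If the stage values are cumulative counters, compare two snapshots taken a few minutes apart to see whether the output share is growing.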
Common causes and fixes
- Output stage dominates.
  - Move collector closer, enable compression (`compression: zstd`), or scale collector.
- Excessive transform complexity.
  - Simplify query or split into named pipelines.
- Node resource pressure.
  - Increase CPU/memory requests for the `ffwd` DaemonSet.
Verify fix
Observe reduced output stage time and stable forwarding counters.
Recovery fallback (safe mode)
If you need immediate stabilization while investigating:
- Remove complex transform filters temporarily.
- Send to `stdout` in a non-production environment to confirm parse path.
- Re-enable OTLP once counters and errors look healthy.
This narrows failure scope without changing input collection semantics.
Helpful diagnostics commands

    # Health and readiness
    curl -s http://localhost:9090/live | jq .
    curl -s http://localhost:9090/ready | jq .

    # End-to-end pipeline stats
    curl -s http://localhost:9090/admin/v1/status | jq .

    # Flattened stats snapshot
    curl -s http://localhost:9090/admin/v1/stats | jq .

What’s next

| Topic | Where to go |
|---|---|
| Check pipeline metrics | Monitoring & Diagnostics |
| Review your config | YAML Reference |
| Understand the pipeline | Pipeline Explorer (interactive) |