Skip to content

Inputs

Sources produce raw bytes or Arrow batches independently and feed into a bounded pipeline channel. FastForward supports 9 input types across two modes:

  • Raw bytes — File, stdin, TCP, UDP, HTTP. The scanner parses these into Arrow RecordBatches.
  • Arrow-native — OTLP, Arrow IPC, Platform Sensor, Generator. These produce structured RecordBatches directly, bypassing the scanner entirely.

Tail files matching glob patterns with position tracking across restarts. Detects rotation via inode/device identity, drains old files before switching.

Key properties: kqueue/inotify + polling hybrid. Adaptive fast-poll on budget hit. LRU eviction at 1024 open files. Checkpoint via atomic JSON write + fsync.

Internal stages: Glob Discovery → File Watcher → Read Loop → Budget Control → EOF Detection

See Tailing for a detailed look at rotation, truncation, and crash recovery.

Read from standard input until EOF, then drain the pipeline and exit.

Key properties: finite command input. Bounded reader channel. No checkpointing, glob discovery, or rotation handling.

Internal stages: Stdin Reader → Bounded Channel → EOF Detection → Drain

Accept log lines on a TCP socket with automatic framing detection. Supports RFC 6587 octet counting and newline-delimited fallback.

Key properties: Max 1024 concurrent clients. Global 256 MiB memory budget. Idle timeout 60s. Oversized lines (>1 MiB) discarded. Optional TLS/mTLS.

Internal stages: Accept → Keepalive → Read (512KB/client) → Frame Detect → Line Extract

Receive log lines as individual datagrams. Kernel buffer tuned to 8 MiB. Drop detection via ENOBUFS/ENOMEM.

Key properties: Max 256 datagrams per poll, 1 MiB emit budget. Fire-and-forget: no ack, no ordering guarantee.

Internal stages: Socket Bind → Buffer Tune (8 MiB) → Drain Loop → Line Normalize

Receive OpenTelemetry log records over HTTP. Decodes protobuf and JSON payloads directly into Arrow RecordBatches — bypasses the scanner.

Key properties: Resource attributes flattened with configurable prefix. Bounded channel to pipeline. Produces RecordBatch directly, not raw bytes.

Internal stages: HTTP Server → Proto/JSON Decode → Resource Flatten → Arrow Convert

Use FastForward as an OTLP relay or aggregator. Because the input produces structured Arrow data directly, there is no parsing overhead — just decode the protobuf envelope and convert. Resource attributes, trace context, and severity are all preserved.

Receive newline-delimited payloads via HTTP POST. Configurable path, method, body size limit, and response code.

Key properties: Background thread with bounded channel. Max 10 MiB body. Supports gzip Content-Encoding.

Internal stages: Axum Server → Route Match → Body Read → Gzip Decompress → Channel Send

Receive Arrow IPC stream payloads over HTTP POST. Bypasses the scanner — decoded RecordBatches go directly into the pipeline.

Key properties: Scanner bypass: zero parsing overhead. Canonical content type: application/vnd.apache.arrow.stream.

Internal stages: HTTP POST /v1/arrow → Zstd Decompress → IPC Stream Decode → RecordBatch

eBPF (Linux), EndpointSecurity (macOS), or ETW (Windows) sensor. Arrow-native control and signal rows for process, file, network, DNS events.

Key properties: Arrow-native: no format/scanner. Control-plane reload via JSON file. Configurable signal families.

Internal stages: Kernel Attach → Ring Buffer Poll → Signal Decode → Arrow Emit

Emit synthetic JSON log lines for benchmarking and pipeline testing. Configurable EPS, batch size, and timestamp generation.

Key properties: 7 built-in profiles: logs (synthetic requests), record (flat JSON from attributes), envoy (proxy access logs), cri_k8s (CRI Kubernetes logs), wide (20+ field structured logs), narrow (minimal 5-field logs), and cloud_trail (AWS audit events). Zero external dependencies.

Internal stages: Profile Select → Row Generate → JSON Serialize → Batch Emit