Receiver

The Receiver is an HTTP webhook endpoint that accepts compressed data batches from Proxy and stores them as raw files in S3-compatible object storage.

What It Does

  1. Accepts HTTP POST requests from Proxy instances
  2. Validates authentication and routing metadata (account, tenant, dataset)
  3. Spools data through a 3-stage durability pipeline
  4. Writes raw files to S3/MinIO

The Receiver does not parse, transform, or inspect event content. It stores data as-is for downstream processing by Piper and Packer.

3-Stage Spooling

Incoming HTTP POST
       │
       ▼
┌─────────────┐
│   Memory    │  Fast acknowledgment to Proxy
└──────┬──────┘
       │
       ▼
┌─────────────┐
│    Disk     │  Persisted locally for crash recovery
└──────┬──────┘
       │
       ▼
┌─────────────┐
│     S3      │  Final durable storage
└─────────────┘

The Receiver acknowledges each batch to Proxy as soon as it reaches the memory stage, which keeps ingestion latency low. Disk and S3 writes then happen asynchronously. If the process crashes, data that had already been spooled to disk is recovered on restart.
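
The three stages can be illustrated with a minimal Python sketch. This is not the Receiver's actual implementation: the class name `SpoolSketch`, its method names, and the `upload` callback are hypothetical, and the stages are driven by explicit calls here rather than by background workers.

```python
import queue
import tempfile
import uuid
from pathlib import Path
from typing import Callable


class SpoolSketch:
    """Hypothetical three-stage spool: memory -> disk -> object storage."""

    def __init__(self, spool_dir: Path, upload: Callable[[str, bytes], None]):
        self.memory: "queue.Queue[bytes]" = queue.Queue()
        self.spool_dir = spool_dir
        self.upload = upload  # stand-in for an S3/MinIO client call
        spool_dir.mkdir(parents=True, exist_ok=True)

    def accept(self, payload: bytes) -> str:
        """Stage 1: buffer the batch in memory and acknowledge right away."""
        self.memory.put(payload)
        return "ack"

    def flush_to_disk(self) -> int:
        """Stage 2: move buffered batches into the local spool directory."""
        written = 0
        while True:
            try:
                payload = self.memory.get_nowait()
            except queue.Empty:
                return written
            final = self.spool_dir / f"{uuid.uuid4().hex}.gz"
            tmp = final.with_suffix(".tmp")
            tmp.write_bytes(payload)
            tmp.replace(final)  # rename so a half-written file is never uploaded
            written += 1

    def drain_to_storage(self) -> int:
        """Stage 3: upload spooled files, deleting each only after upload returns."""
        uploaded = 0
        for path in sorted(self.spool_dir.glob("*.gz")):
            self.upload(path.name, path.read_bytes())
            path.unlink()
            uploaded += 1
        return uploaded


if __name__ == "__main__":
    stored: dict = {}
    with tempfile.TemporaryDirectory() as d:
        spool = SpoolSketch(Path(d), stored.__setitem__)
        spool.accept(b"compressed-batch")
        spool.flush_to_disk()
        # A fresh instance over the same directory models recovery after a restart.
        SpoolSketch(Path(d), stored.__setitem__).drain_to_storage()
        print(len(stored))
```

In this sketch, files left in the spool directory are picked up by whichever instance next drains it, which is how the restart case is modeled.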

High Availability

Run multiple Receiver instances behind a load balancer. Each instance writes to S3 independently; there is no shared state or coordination between instances.

Proxy ──┬──▶ Receiver-1 ──▶ S3
        ├──▶ Receiver-2 ──▶ S3
        └──▶ Receiver-N ──▶ S3

Use any HTTP load balancer (NGINX, HAProxy, AWS ALB, Kubernetes Service). Proxy handles retries and failover automatically via its spooling mechanism.

Ports

The Receiver runs two listeners:

Port   Listener   Purpose
8080   Webhook    Accepts HTTP POST data from Proxy
8081   API        Health checks, DLQ management, config

Configuration

app:
  name: "bytefreezer-receiver"

server:
  api_port: 8081                           # API / health endpoint

webhook:
  port: 8080                               # Data ingestion from Proxy
  read_timeout_seconds: 30
  write_timeout_seconds: 30

s3:
  bucket_name: "bytefreezer-intake"
  region: "us-east-1"
  endpoint: "minio:9000"                   # or s3.amazonaws.com
  ssl: false
  use_iam_role: false

control_service:
  enabled: true
  control_url: "https://api.bytefreezer.com"
  timeout_seconds: 30

spooling:
  enabled: true
  directory: "/var/spool/bytefreezer-receiver"
  max_size_bytes: 1073741824               # 1 GiB (1024^3 bytes) disk spool limit
  retry_attempts: 3
  retry_interval_seconds: 60

health_reporting:
  enabled: true
  report_interval: 30
  register_on_startup: true

S3 credentials and the control API key are provided via the installer's .env file or Kubernetes Secrets. See the installer project for deployment templates.
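
To make the `spooling.max_size_bytes` setting concrete, here is a small Python sketch of how a disk spool limit of this kind could be checked. The helper names `spool_usage_bytes` and `has_room` are hypothetical, and the source does not describe how the Receiver itself enforces the limit or what it does when the spool is full.

```python
from pathlib import Path

# Value of spooling.max_size_bytes in the configuration above (1024**3 bytes).
MAX_SIZE_BYTES = 1073741824


def spool_usage_bytes(directory: Path) -> int:
    """Total size of all regular files under the spool directory."""
    return sum(p.stat().st_size for p in directory.rglob("*") if p.is_file())


def has_room(directory: Path, incoming_bytes: int, limit: int = MAX_SIZE_BYTES) -> bool:
    """True if a batch of incoming_bytes would still fit under the limit."""
    return spool_usage_bytes(directory) + incoming_bytes <= limit
```

For example, `has_room(Path("/var/spool/bytefreezer-receiver"), 4096)` reports whether a 4 KiB batch would fit under the configured limit.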

S3 Path Layout

Raw files are stored with the following key structure:

s3://bucket/raw/{account_id}/{tenant_id}/{dataset_id}/{timestamp}_{batch_id}.gz

Piper and Packer read from this path to process and convert data downstream.
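
As an illustration of the key structure, the following Python sketch builds an object key from its parts. The function name `raw_object_key` is hypothetical, and the timestamp format is an assumption: the source does not specify how `{timestamp}` is encoded, so a compact UTC form is used here purely as an example.

```python
from datetime import datetime, timezone


def raw_object_key(account_id: str, tenant_id: str, dataset_id: str,
                   batch_id: str, when: datetime) -> str:
    """Build a key shaped like raw/{account}/{tenant}/{dataset}/{timestamp}_{batch}.gz."""
    for part in (account_id, tenant_id, dataset_id, batch_id):
        # Reject empty segments and embedded slashes so the layout stays four levels deep.
        if not part or "/" in part:
            raise ValueError(f"invalid key segment: {part!r}")
    if when.tzinfo is None:
        raise ValueError("when must be timezone-aware")
    # Assumed format, e.g. 20240102T030405Z; the real encoding may differ.
    timestamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"raw/{account_id}/{tenant_id}/{dataset_id}/{timestamp}_{batch_id}.gz"


if __name__ == "__main__":
    key = raw_object_key("acct1", "tenantA", "logs", "b42",
                         datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc))
    print(key)  # raw/acct1/tenantA/logs/20240102T030405Z_b42.gz
```

Because the account, tenant, and dataset form the key prefix, a downstream reader can list a single dataset's files by prefix, for example `raw/acct1/tenantA/logs/`.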