On-Premises Deployment

Deploy ByteFreezer data processing on your infrastructure while using the managed control plane.

Overview

| Location | Components |
|---|---|
| Your Infrastructure | Proxy, Receiver, Piper, Packer, Query, Connector |
| ByteFreezer Managed | Control Plane, UI |

  • Data sovereignty: Raw data stays in your environment
  • Compliance: Data never leaves your network
  • Central management: UI and configuration via the managed control plane
  • Simplified operations: No need to manage control plane infrastructure

Architecture

┌────────────────────────────────────────────────────────────────┐
│                      Your Infrastructure                       │
│                                                                │
│   ┌─────────┐     ┌──────────┐     ┌─────────┐                 │
│   │  Proxy  │────▶│ Receiver │────▶│  Piper  │                 │
│   └─────────┘     └──────────┘     └────┬────┘                 │
│                                         │                      │
│                        ┌────────────────┤                      │
│                        ▼                ▼                      │
│                   ┌─────────┐      ┌─────────┐                 │
│                   │ Packer  │      │  Query  │                 │
│                   └────┬────┘      └────┬────┘                 │
│                        │                │                      │
│                        ▼                ▼                      │
│                   ┌─────────────────────────────┐              │
│                   │     Your S3 Storage         │              │
│                   │     (MinIO, AWS S3, etc.)   │              │
│                   └──────────────┬──────────────┘              │
│                                  │                             │
│                                  ▼                             │
│                        ┌──────────────────┐                    │
│                        │    Connector     │──▶ Elasticsearch   │
│                        │  (data export)   │──▶ Splunk          │
│                        └──────────────────┘──▶ Webhooks        │
│                                                                │
└───────────────────────────────┬────────────────────────────────┘
                                │
                                │ API Key Authentication
                                ▼
┌────────────────────────────────────────────────────────────────┐
│                  ByteFreezer Managed Control                   │
│        ┌─────────────────────────────────────────────┐         │
│        │          Control Plane                      │         │
│        │   - Configuration management                │         │
│        │   - Dataset definitions                     │         │
│        │   - Transformation rules                    │         │
│        │   - User management                         │         │
│        └─────────────────────────────────────────────┘         │
│        ┌─────────────────────────────────────────────┐         │
│        │          Web UI (bytefreezer.com)           │         │
│        └─────────────────────────────────────────────┘         │
└────────────────────────────────────────────────────────────────┘

Prerequisites

  1. ByteFreezer Account: Sign up at bytefreezer.com
  2. Infrastructure: Linux servers or Kubernetes cluster
  3. S3 Storage: MinIO, AWS S3, or compatible object storage
  4. Network: Outbound HTTPS (443) to api.bytefreezer.com

Service Ports

| Component | API Port | Data Port | Notes |
|---|---|---|---|
| Proxy | 8008 | 514, 6343, 4739, etc. | Data ports depend on configured plugins |
| Receiver | 8081 | 8080 (webhook) | Proxy sends data to webhook port |
| Piper | 8082 | | Reads from S3, writes to S3 |
| Packer | 8083 | | Reads from S3, writes Parquet to S3 |
| Query | 8000 | | Optional, SQL + AI queries |
| Connector | 8090 | | Optional, data export to external systems |
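
As a quick local check, the API ports listed above can be probed from the host that runs the services. The sketch below is an illustration only: the helper names api_port and probe_api_ports are hypothetical, and it assumes that nc (netcat) with the -z flag is installed and that the services use the default ports from the table.

```shell
# Hypothetical helper: print the default API port for a component,
# using the values from the Service Ports table.
api_port() {
  case "$1" in
    proxy)     echo 8008 ;;
    receiver)  echo 8081 ;;
    piper)     echo 8082 ;;
    packer)    echo 8083 ;;
    query)     echo 8000 ;;
    connector) echo 8090 ;;
    *)         echo "unknown component: $1" >&2; return 1 ;;
  esac
}

# Report whether each component's API port accepts TCP connections.
# Assumes nc (netcat) is available; the host defaults to localhost.
probe_api_ports() {
  host=${1:-localhost}
  for component in proxy receiver piper packer query connector; do
    port=$(api_port "$component")
    if nc -z "$host" "$port" 2>/dev/null; then
      echo "$component: port $port open"
    else
      echo "$component: port $port closed"
    fi
  done
}
```

Calling probe_api_ports with no argument checks localhost. Query and Connector are optional, so a closed port for those two is expected when they are not deployed.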

Step 1: Generate an API Key

API keys authenticate your on-prem services with the managed control plane.

  1. Log into bytefreezer.com
  2. Navigate to API Keys in the sidebar
  3. Click Generate New Key
  4. Select Service as the key type
  5. Enter a descriptive name (e.g., "production-datacenter-1")
  6. Copy the key immediately — it will only be shown once

Key Security

The service key provides full API access for your account. Treat it like a password:

  • Never commit to version control
  • Use environment variables or secrets management
  • Rotate keys periodically
  • Revoke immediately if compromised
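
One way to follow these rules in deployment scripts is to read the key from the environment, fail fast when it is missing, and log only a short prefix. The sketch below assumes the key is exported as BYTEFREEZER_API_KEY (the variable name used in the health check example on this page); the function names are hypothetical.

```shell
# Return 0 if the service key is set and non-empty, 1 otherwise.
# The key itself is never printed.
require_api_key() {
  if [ -z "${BYTEFREEZER_API_KEY:-}" ]; then
    echo "BYTEFREEZER_API_KEY is not set" >&2
    return 1
  fi
}

# Print only the first 8 characters of a key, which is safe for log lines.
key_prefix() {
  printf '%.8s\n' "$1"
}
```

A script could call require_api_key before starting any service and use key_prefix "$BYTEFREEZER_API_KEY" when it needs to record which key is in use.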

Step 2: Deploy Using the Installer

The installer project provides ready-to-use deployment packages for all supported platforms.

Deployment Options

| Platform | Method | Directory | Best For |
|---|---|---|---|
| Kubernetes | Helm charts | helm/ | Production clusters |
| Docker | Docker Compose | docker/ | Single-host, dev/test |
| Bare metal | Ansible | ansible/ | VMs, systemd services |
| AWS | ECS Fargate | ecs/ | Serverless on AWS |
| GCP | GKE / Cloud Run | gcp/ | Google Cloud |
| Azure | AKS / Container Instances | azure/ | Microsoft Azure |

Quick Start — Docker Compose

git clone https://github.com/bytefreezer/installer.git
cd installer/docker/bytefreezer

# Configure
cp .env.example .env
# Edit .env — set CONTROL_URL and CONTROL_API_KEY

# Start with bundled MinIO
docker compose --profile with-minio up -d

Quick Start — Kubernetes (Helm)

git clone https://github.com/bytefreezer/installer.git
cd installer/helm

# Deploy processing stack
helm install bytefreezer ./bytefreezer \
  --set minio.enabled=true \
  --set controlService.url=https://api.bytefreezer.com \
  --set controlService.apiKey=YOUR_API_KEY

# Deploy proxy (at edge)
helm install proxy ./proxy \
  --set receiver.url=http://bytefreezer-receiver:8080 \
  --set controlService.url=https://api.bytefreezer.com \
  --set controlService.apiKey=YOUR_API_KEY

Quick Start — Ansible (Bare Metal)

git clone https://github.com/bytefreezer/installer.git
cd installer/ansible/bytefreezer

cp inventory.yml.example inventory.yml
# Edit with your servers

cp vars/secrets.yml.example vars/secrets.yml
# Edit with API key
ansible-vault encrypt vars/secrets.yml

ansible-playbook -i inventory.yml playbook.yml --ask-vault-pass

AI-Assisted Deployment (MCP)

The ByteFreezer MCP server can generate deployment packages for any platform. Connect it to your AI assistant and use:

  • bf_generate_docker_compose — Docker Compose + config files
  • bf_generate_helm_values — Helm values.yaml
  • bf_generate_systemd — systemd install script
  • bf_generate_standalone — standalone shell script

Step 3: Verify Connectivity

Check that the services have registered with the control plane:

  1. Log into bytefreezer.com
  2. Navigate to Health in the sidebar
  3. All deployed services should show as Healthy

Or test via API:

curl -s -w "\nHTTP: %{http_code}\n" \
  -H "Authorization: Bearer $BYTEFREEZER_API_KEY" \
  https://api.bytefreezer.com/api/v1/health
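
The status code from this request can be turned into a short diagnosis for use in scripts. The following is a minimal sketch, not an official tool: the function names are hypothetical, the 401 hint follows the Troubleshooting section, and 000 is the value curl writes for %{http_code} when it receives no HTTP response.

```shell
# Map an HTTP status code from the health endpoint to a short diagnosis.
classify_health_status() {
  case "$1" in
    2??) echo "ok" ;;
    401) echo "unauthorized: check the API key" ;;
    000) echo "no response: check outbound HTTPS to api.bytefreezer.com" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Run the health check and print the diagnosis (makes a network call).
# Expects BYTEFREEZER_API_KEY to be set in the environment.
check_control_plane() {
  code=$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $BYTEFREEZER_API_KEY" \
    https://api.bytefreezer.com/api/v1/health || true)
  classify_health_status "$code"
}
```

Running check_control_plane from each host that runs a service shows whether that host can reach and authenticate with the control plane.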

Configuration Reference

Each service reads its configuration from a YAML file. The Docker images ship with defaults that work for Docker Compose + MinIO deployments — only secrets need to be provided. See Configuration for the full reference.

Below are the actual config key names for each service.

Proxy

app:
  name: "bytefreezer-proxy"

server:
  api_port: 8008

config_mode: "control-only"
account_id: "your-account-id"
bearer_token: "your-api-key"
control_url: "https://api.bytefreezer.com"

receiver:
  base_url: "http://your-receiver:8080"

config_polling:
  enabled: true
  interval_seconds: 60
  cache_directory: "/var/cache/bytefreezer-proxy"

batching:
  enabled: true
  max_bytes: 10485760
  timeout_seconds: 60
  compression_enabled: true

spooling:
  enabled: true
  directory: "/var/spool/bytefreezer-proxy"
  max_size_bytes: 1073741824

health_reporting:
  enabled: true
  report_interval: 30
  register_on_startup: true

error_tracking:
  enabled: true
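
The size limits in the Proxy config are raw byte counts: batching.max_bytes of 10485760 is 10 MiB, and spooling.max_size_bytes of 1073741824 is 1 GiB. When choosing other limits, shell arithmetic can produce the value; the helper names below are illustrative and not part of ByteFreezer.

```shell
# Convert mebibytes or gibibytes to bytes for use in size settings.
mib_to_bytes() { echo $(( $1 * 1024 * 1024 )); }
gib_to_bytes() { echo $(( $1 * 1024 * 1024 * 1024 )); }

mib_to_bytes 10   # prints 10485760   (batching.max_bytes above)
gib_to_bytes 1    # prints 1073741824 (spooling.max_size_bytes above)
```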

Receiver

app:
  name: "bytefreezer-receiver"

server:
  api_port: 8081

protocols:
  webhook:
    enabled: true
    port: 8080
    max_payload_size: 10485760

bytefreezer:
  upload_worker_count: 10
  spool_path: "/var/spool/bytefreezer-receiver"

s3destination:
  bucket_name: "bytefreezer-intake"
  region: "us-east-1"
  endpoint: "minio:9000"
  ssl: false
  access_key: "your-s3-key"
  secret_key: "your-s3-secret"

control_service:
  enabled: true
  control_url: "https://api.bytefreezer.com"
  api_key: "your-service-key"
  account_id: "your-account-id"

dlq:
  enabled: true
  directory: "/var/spool/bytefreezer-receiver"
  retry_attempts: 3
  retry_interval_seconds: 60

housekeeping:
  enabled: true
  interval_seconds: 300

health_reporting:
  enabled: true
  report_interval: 30
  register_on_startup: true

error_tracking:
  enabled: true

Receiver requires control_service section

The control_service section is required for tenant ID validation. Without it, all POSTs return HTTP 410 "Tenant not found or inactive". The health_reporting section is only for health check reporting — it does NOT replace control_service.
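
A pre-start check can catch a missing section before the Receiver begins rejecting requests. This is a rough sketch with hypothetical function names: it only looks for a key at the start of a line and does not parse YAML, so it assumes top-level keys are unindented as in the sample config.

```shell
# Return 0 if the YAML file has the given key at the top level (column 0).
# Line-based check only; it does not parse YAML.
has_top_level_key() {
  grep -q "^$2:" "$1"
}

# Fail with a message when a Receiver config lacks the control_service section.
check_receiver_config() {
  if ! has_top_level_key "$1" control_service; then
    echo "$1: missing control_service section" >&2
    return 1
  fi
}
```

Pass the path of the Receiver config file used by your deployment, for example check_receiver_config path/to/receiver-config.yaml.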

Piper

app:
  name: "bytefreezer-piper"

server:
  api_port: 8082

s3_source:
  bucket_name: "bytefreezer-intake"
  region: "us-east-1"
  endpoint: "minio:9000"
  ssl: false
  access_key: "your-s3-key"
  secret_key: "your-s3-secret"
  poll_interval: "30s"

s3_destination:
  bucket_name: "bytefreezer-piper"
  region: "us-east-1"
  endpoint: "minio:9000"
  ssl: false
  access_key: "your-s3-key"
  secret_key: "your-s3-secret"

processing:
  max_concurrent_jobs: 10
  job_timeout_seconds: 600
  retry_attempts: 3

control_service:
  enabled: true
  control_url: "https://api.bytefreezer.com"
  api_key: "your-service-key"
  account_id: "your-account-id"

housekeeping:
  enabled: true
  interval_seconds: 300

health_reporting:
  enabled: true
  report_interval: 30
  register_on_startup: true

error_tracking:
  enabled: true

job_timeout_seconds is required

processing.job_timeout_seconds must be set to a non-zero value (recommended: 600). If zero or missing, all S3 and API operations during processing will fail immediately with "context deadline exceeded".
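
Because a missing or zero timeout only shows up at runtime, it can help to check the config file before starting Piper. The sketch below uses hypothetical function names and a simple line scan with awk rather than a YAML parser; it assumes the value is an unquoted integer nested directly under processing, as in the sample config.

```shell
# Print processing.job_timeout_seconds from a Piper config file
# (prints nothing if the key is absent).
job_timeout_seconds() {
  awk '
    /^processing:/ { in_section = 1; next }
    /^[^ #]/       { in_section = 0 }
    in_section && $1 == "job_timeout_seconds:" { print $2; exit }
  ' "$1"
}

# Return 0 only when the timeout is present and a positive integer.
check_job_timeout() {
  value=$(job_timeout_seconds "$1")
  case "$value" in
    ''|*[!0-9]*)
      echo "$1: job_timeout_seconds missing or not a number" >&2
      return 1 ;;
  esac
  if [ "$value" -eq 0 ]; then
    echo "$1: job_timeout_seconds is 0" >&2
    return 1
  fi
}
```

check_job_timeout path/to/piper-config.yaml exits non-zero for exactly the cases described above: the key is missing, or it is set to zero.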

Packer

app:
  name: "bytefreezer-packer"

server:
  api_port: 8083

bytefreezer:
  spool_path: "/var/spool/bytefreezer-packer"
  cache_path: "/var/cache/bytefreezer-packer"

s3source:
  bucket_name: "bytefreezer-piper"
  region: "us-east-1"
  endpoint: "minio:9000"
  ssl: false
  access_key: "your-s3-key"
  secret_key: "your-s3-secret"

control_service:
  enabled: true
  control_url: "https://api.bytefreezer.com"
  api_key: "your-service-key"
  account_id: "your-account-id"

parquet:
  max_file_size_mb: 64
  timeout_seconds: 1200
  compression: "zstd"
  streaming_mode: true
  atomic_upload: true

housekeeping:
  enabled: true
  interval_seconds: 300
  cleanup:
    enabled: true

health_reporting:
  enabled: true
  report_interval: 30
  register_on_startup: true

error_tracking:
  enabled: true

Packer paths

bytefreezer.spool_path and bytefreezer.cache_path are required. In Docker, both must be volumes writable by uid 1000. The cache directory holds intermediate processing files.
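
A writability check run as the same user as Packer (for example inside the container) can confirm both paths before the service starts. This is a minimal sketch with hypothetical function names; the default paths are the ones from the sample config.

```shell
# Return 0 if the path is an existing directory the current user can write to.
check_writable_dir() {
  if [ ! -d "$1" ]; then
    echo "$1: not a directory" >&2
    return 1
  fi
  if [ ! -w "$1" ]; then
    echo "$1: not writable by uid $(id -u)" >&2
    return 1
  fi
}

# Check both Packer paths; defaults are the values from the sample config.
check_packer_paths() {
  spool=${1:-/var/spool/bytefreezer-packer}
  cache=${2:-/var/cache/bytefreezer-packer}
  check_writable_dir "$spool" && check_writable_dir "$cache"
}
```

If the directories are bind-mounted from the host, creating them with mkdir -p and handing them to uid 1000 with chown before starting the container is one way to satisfy the requirement; that step assumes bind mounts rather than named volumes.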

Query (Optional)

app:
  name: "bytefreezer-query"

server:
  port: 8000

control:
  url: "https://api.bytefreezer.com"
  api_key: "your-api-key"
  account_id: "your-account-id"

health_reporting:
  enabled: true
  report_interval: 30

# LLM for natural language queries (optional)
# Leave provider empty to disable NL queries — raw SQL always works
llm:
  provider: ""
  api_key: ""
  model: ""

limits:
  max_time_range_hours: 720
  max_row_limit: 10000
  allow_order_by: true

Connector (Optional)

The Connector exports subsets of your Parquet data to external systems (Elasticsearch, Splunk, webhooks). See Connector for full documentation.

server:
  port: 8090

control:
  url: "https://api.bytefreezer.com"
  api_key: "your-service-key"
  account_id: "your-account-id"

health_reporting:
  enabled: true
  report_interval: 30

# Query config (required for batch/watch modes)
query:
  tenant_id: ""
  dataset_id: ""
  sql: "SELECT * FROM read_parquet('PARQUET_PATH', hive_partitioning=true, union_by_name=true) LIMIT 100"

destination:
  type: stdout            # stdout, elasticsearch, webhook
  config: {}

schedule:
  interval_seconds: 60    # Watch mode poll interval
  batch_size: 1000

S3 Credentials

S3 access keys and the control API key are provided via environment variables, .env files, or Kubernetes Secrets depending on your deployment method. The installer handles this for you. See the installer project for details.

Managing API Keys

Viewing Keys

In the UI, navigate to API Keys to see all keys. Service keys show:

  • Key name
  • Key prefix (first 8 characters for identification)
  • Creation date
  • Last used timestamp

Revoking Keys

To revoke a compromised or unused key:

  1. Navigate to API Keys
  2. Find the service key in the list
  3. Click Revoke
  4. Confirm revocation

Immediate Effect

Revoked keys stop working immediately. All services using that key will lose access to the control plane.

Key Rotation

  1. Generate a new service key
  2. Update service configurations (or .env / Secrets)
  3. Restart services to pick up the new key
  4. Verify connectivity via the Health page
  5. Revoke the old key
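
For Docker Compose deployments, step 2 usually means changing CONTROL_API_KEY in the .env file. The sketch below is one possible way to script that edit; the function name is hypothetical, and it assumes the key lives on a single CONTROL_API_KEY= line as in the Quick Start.

```shell
# Replace the CONTROL_API_KEY line in an env file, or append it if absent.
# The file is rewritten via a temporary file so other lines stay untouched.
set_control_api_key() {
  env_file=$1
  new_key=$2
  tmp_file=$(mktemp) || return 1
  found=0
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      CONTROL_API_KEY=*)
        printf 'CONTROL_API_KEY=%s\n' "$new_key"
        found=1 ;;
      *)
        printf '%s\n' "$line" ;;
    esac
  done < "$env_file" > "$tmp_file"
  if [ "$found" -eq 0 ]; then
    printf 'CONTROL_API_KEY=%s\n' "$new_key" >> "$tmp_file"
  fi
  mv "$tmp_file" "$env_file"
}
```

After set_control_api_key .env "$NEW_KEY", continue with steps 3 to 5: restart the services, confirm they show as healthy, and only then revoke the old key.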

Troubleshooting

"401 Unauthorized" errors

  • Verify that the API key is correct and has not been revoked
  • Check that the key is passed in the Authorization: Bearer <key> header
  • Ensure the API key environment variable is set

"Connection refused" to control plane

  • Verify outbound HTTPS (443) is allowed to api.bytefreezer.com
  • Check for proxy/firewall blocking the connection
  • Test with: curl -v https://api.bytefreezer.com/api/v1/health

Services not processing data

  • Check logs for authentication errors
  • Verify S3 credentials and bucket access
  • Confirm control plane connectivity from each service
  • Check the Health page in the UI for status details
  • Verify that processing.job_timeout_seconds is set to a non-zero value (Piper)
  • Verify that the control_service section exists (Receiver)

Packer not producing Parquet files

  • Verify that bytefreezer.spool_path and bytefreezer.cache_path are set and writable
  • Check that housekeeping.enabled is true
  • Verify that the S3 source bucket has data from Piper
  • If Packer was started before the tenant was created, it may need a restart

Next Steps