Sources

ByteFreezer accepts data from a wide variety of sources. The Proxy component handles data collection and forwarding.

Supported Protocols

Network Protocols

| Protocol | Port | Description |
|----------|------|-------------|
| UDP | Configurable | High-throughput, connectionless |
| TCP | Configurable | Reliable, connection-oriented |
| Syslog | 514 (UDP/TCP) | Standard logging protocol (RFC 5424) |
| sFlow | 6343 (UDP) | Network traffic sampling |
| IPFIX | 4739 (UDP/TCP) | IP Flow Information Export |

HTTP/Webhook

| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | /webhook/{tenant}/{dataset} | Direct HTTP ingestion |
| POST | /v1/events | Batch event submission |

Message Queues

| Queue | Description |
|-------|-------------|
| SQS | AWS Simple Queue Service |
| Kafka | Apache Kafka topics |
| NATS | NATS messaging |
| Kinesis | AWS Kinesis streams |

Configuration

UDP/TCP Sources

```yaml
sources:
  - type: udp
    port: 5514
    format: syslog

  - type: tcp
    port: 5515
    format: json
    tls: true
```
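To verify that a UDP source is listening, a short test message can be sent from a script. The sketch below is a minimal, stdlib-only example; the host, port, and message contents are assumptions chosen to match the sample config above, not a ByteFreezer utility:

```python
import socket


def send_test_syslog(host="127.0.0.1", port=5514):
    """Send a minimal RFC 5424-style test message over UDP.

    Host and port are assumptions; match them to the Proxy's
    configured udp source. Returns the number of bytes sent.
    """
    # <14> = facility 1 (user-level), severity 6 (informational); version 1
    msg = "<14>1 2024-01-15T10:30:00Z testhost test-app - - - hello from test"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(msg.encode("utf-8"), (host, port))
```

Because UDP is connectionless, the send succeeds even if nothing is listening; confirm receipt on the Proxy side.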

sFlow/IPFIX Sources

```yaml
sources:
  - type: sflow
    port: 6343

  - type: ipfix
    port: 4739
```

Message Queue Sources

```yaml
sources:
  - type: kafka
    brokers:
      - kafka1:9092
      - kafka2:9092
    topic: security-events
    group_id: bytefreezer

  - type: sqs
    queue_url: https://sqs.us-east-1.amazonaws.com/123456789/events
    region: us-east-1
```

Webhook Endpoint

For HTTP-based ingestion, send events directly to the Receiver:

```bash
# Single event
curl -X POST https://receiver.bytefreezer.com/webhook/{tenant}/{dataset} \
  -H "Content-Type: application/json" \
  -d '{"timestamp": "2024-01-15T10:30:00Z", "level": "info", "message": "Event"}'

# Batch events (JSON Lines)
curl -X POST https://receiver.bytefreezer.com/webhook/{tenant}/{dataset} \
  -H "Content-Type: application/x-ndjson" \
  -d '{"timestamp": "2024-01-15T10:30:00Z", "event": "login"}
{"timestamp": "2024-01-15T10:30:01Z", "event": "logout"}'
```
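The same calls can be made from code. This sketch (Python stdlib only) builds a JSON Lines batch body and posts it; the receiver URL, tenant, and dataset in the usage comment are placeholders:

```python
import json
import urllib.request


def build_ndjson(events):
    """Serialize a list of event dicts as JSON Lines: one object per line."""
    return "\n".join(json.dumps(e) for e in events).encode("utf-8")


def post_batch(url, events):
    """POST an NDJSON batch to a webhook endpoint; returns the HTTP status."""
    req = urllib.request.Request(
        url,
        data=build_ndjson(events),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Usage (URL is a placeholder):
# post_batch("https://receiver.bytefreezer.com/webhook/acme/auth-logs",
#            [{"timestamp": "2024-01-15T10:30:00Z", "event": "login"}])
```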

Data Formats

ByteFreezer accepts multiple data formats:

| Format | Content-Type | Description |
|--------|--------------|-------------|
| JSON | application/json | Single JSON object |
| JSON Lines | application/x-ndjson | One JSON object per line |
| Syslog | N/A (network) | RFC 5424 structured data |
| Raw | text/plain | Plain text (parsed downstream) |

Best Practices

High-Throughput Sources

For high-volume sources like sFlow or IPFIX:

  1. Deploy Proxy close to source - Minimize network latency
  2. Use UDP where possible - Lower overhead than TCP
  3. Enable batching - Group events before forwarding
  4. Consider sampling - Use Piper to sample if volume is too high
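The batching recommendation can be sketched as a size-or-age buffer. This is a hypothetical helper illustrating the idea, not ByteFreezer's internal batching:

```python
import time


class EventBatcher:
    """Buffer events and flush when the batch is full or too old.

    Illustrative sketch of the "enable batching" recommendation;
    the size and age thresholds here are arbitrary examples.
    """

    def __init__(self, max_size=500, max_age_s=2.0):
        self.max_size = max_size
        self.max_age_s = max_age_s
        self._buf = []
        self._first_at = None

    def add(self, event):
        """Add an event; return the full batch when it is time to flush."""
        if self._first_at is None:
            self._first_at = time.monotonic()
        self._buf.append(event)
        too_big = len(self._buf) >= self.max_size
        too_old = time.monotonic() - self._first_at >= self.max_age_s
        if too_big or too_old:
            batch = self._buf
            self._buf = []
            self._first_at = None
            return batch
        return None
```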

Reliable Sources

For critical data that must not be lost:

  1. Use TCP or HTTP - Ensures delivery confirmation
  2. Enable TLS - Encrypt in transit
  3. Use message queues - SQS/Kafka provide persistence
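On the TCP/HTTP path, delivery confirmation usually pairs with retries. Below is a generic retry-with-exponential-backoff sketch; the `send` callable and its raise-on-failure contract are assumptions, not a ByteFreezer API:

```python
import time


def send_with_retries(send, payload, attempts=5, base_delay=0.5):
    """Call send(payload) until it succeeds, backing off exponentially.

    `send` is any callable that raises on failure (e.g. a wrapper
    around an HTTP POST). Re-raises after the final attempt fails.
    """
    for attempt in range(attempts):
        try:
            return send(payload)
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```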

Multi-Source Environments

When collecting from multiple sources:

  1. Use separate tenants - One tenant per source type
  2. Tag at collection - Add source metadata in Proxy
  3. Consistent schemas - Normalize data with Piper transformations
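Tagging at collection can be as simple as stamping each event with source metadata before forwarding. The field names below (`_source_type`, `_collector`) are illustrative, not a ByteFreezer schema; pick names consistent with your downstream Piper transformations:

```python
def tag_event(event, source_type, collector):
    """Return a copy of the event annotated with source metadata.

    Copying (rather than mutating) keeps the original event intact
    if it is also forwarded elsewhere.
    """
    tagged = dict(event)
    tagged["_source_type"] = source_type
    tagged["_collector"] = collector
    return tagged
```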