Next-Gen Streaming Infrastructure

Summary

On 2026-04-07, the next-generation streaming infrastructure design was collaboratively authored by Michael Pentz and Claude, building on the v0.0.6 Global Relay Mesh Design from 2026-03-21. The design addresses three core IRL streaming pain points: high costs ($100/SIM x 4 carriers), unreliable connections, and complex setup. The long-term vision is “click live anywhere” with a phone, WiFi, and 2 SIMs producing reliable high-quality streams with minimal configuration.

The design is organized into 4 phases. Phase 1 (approved) covers immediate SRT parameter tuning on existing infrastructure and SRTLA receiver improvements. Phase 2 (approved, pending research) covers global server deployment across 8+ regions and a clean-room SRTLA receiver rewrite to eliminate AGPL licensing obligations. Phase 3 (approved, Android first) is a proprietary Android streaming app built on RootEncoder (Apache-2.0). Phase 4 is an ongoing technology watch covering RIST Main Profile, QUIC/MoQ, AV1 hardware encoding, and WiFi 7 MLO.

The implementation plan defines three parallel development tracks: Track 1 ships immediately via config changes (latency_min from 200ms to 1000ms on existing relay servers), Track 2 builds a proprietary Go SRT server using haivision/srtgo bindings to replace OpenIRL SLS entirely, and Track 3 is the clean-room SRTLA receiver rewrite. Track 2 implements the exact API contract from sls_client.go so the existing Go control plane (pool_provisioner.go) works unchanged. The implementation plan includes full code scaffolds for the stream registry, HTTP management API, SRT listener with socket options, and config structures.

Timeline

  • 2026-03-21: v0.0.6 Global Relay Mesh Design completed. Foundation for next-gen infrastructure.
  • 2026-04-07: Server audit performed on KC (Advin, 4vCPU/8GB/32TB/1Gbps) and LV (Frantech, 1GB/“unlimited”/1Gbps) relay servers. Findings: both run default SRT params (lossmaxttl=0, latency_min=200, default oheadbw=25%) and neither has a stats_token configured. UFW on LV was hardened during the audit.
  • 2026-04-07: Next-gen streaming infrastructure design authored and approved (Phase 1). Phases 2-4 pending research.
  • 2026-04-07: Implementation plan authored with three parallel tracks: config tuning (ships now), proprietary Go SRT server, and clean-room SRTLA receiver.
  • 2026-04-07: Research completed on FEC, protocol landscape (SRT, RIST, QUIC/MoQ, WebRTC), AV1 codec status, mobile app landscape (IRL Pro, Moblin, RootEncoder), and AGPL licensing implications.

Current State

Production relay servers (KC + LV): Both run identical configs with default SRT parameters. OpenIRL SLS already hardcodes lossmaxttl=40 on the SRTLA listener and applies SRTO_SRTLAPATCHES (disables dynamic reorder tolerance + periodic NAK for SRTLA connections). latency_min and latency_max are configurable and currently set to 200 and 5000. oheadbw, peeridletimeo, and packetfilter (FEC) are not exposed in sls.conf; changing them requires an SLS fork patch or the proprietary replacement. The stats endpoint is unauthenticated (no --stats_token).
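
A minimal sketch of how a replacement server might carry these parameters in one config structure and render them as string-valued socket options. The SRTConfig type, its field names, and the SocketOptions helper are hypothetical. The option keys follow libsrt's option names and are assumed, not confirmed, to match what the srtgo binding accepts. The values shown are the ones named in this document, except peeridletimeo, which uses the libsrt default of 5000 ms.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// SRTConfig is a hypothetical config structure; field names are illustrative.
type SRTConfig struct {
	LatencyMs      int    // receiver latency buffer in ms (latency_min in sls.conf)
	LossMaxTTL     int    // reorder tolerance, in packets
	OverheadBWPct  int    // retransmission bandwidth overhead, in percent
	PeerIdleTimeMs int    // peer idle timeout in ms
	PacketFilter   string // optional FEC filter config; empty means ARQ only
}

// SocketOptions renders the config as string key/value pairs.
// Key names are assumed to mirror libsrt option names.
func (c SRTConfig) SocketOptions() map[string]string {
	opts := map[string]string{
		"latency":       strconv.Itoa(c.LatencyMs),
		"lossmaxttl":    strconv.Itoa(c.LossMaxTTL),
		"oheadbw":       strconv.Itoa(c.OverheadBWPct),
		"peeridletimeo": strconv.Itoa(c.PeerIdleTimeMs),
	}
	if c.PacketFilter != "" {
		opts["packetfilter"] = c.PacketFilter
	}
	return opts
}

func main() {
	cfg := SRTConfig{LatencyMs: 1000, LossMaxTTL: 40, OverheadBWPct: 25, PeerIdleTimeMs: 5000}
	opts := cfg.SocketOptions()

	// Print in sorted key order so the output is stable.
	keys := make([]string, 0, len(opts))
	for k := range opts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, opts[k])
	}
}
```

Leaving PacketFilter empty omits the packetfilter key entirely, which keeps the connection on ARQ only.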

Glass-to-glass latency: Currently ~4-5 seconds (phone to relay to OBS to platform to viewer). Increasing latency_min from 200ms to 1000ms adds ~800ms buffer at the relay, pushing glass-to-glass to ~5-6 seconds. Trade-off: fewer stutters and drops, more consistent quality.
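
The arithmetic behind that estimate, as a small sketch. It assumes the extra relay buffer adds to end-to-end delay one-for-one and that nothing else in the chain changes; the function name is hypothetical.

```go
package main

import "fmt"

// shiftRangeMs adds the extra relay buffer (new latency_min minus old)
// to both ends of a glass-to-glass latency range, all in milliseconds.
func shiftRangeMs(lowMs, highMs, oldLatencyMin, newLatencyMin int) (int, int) {
	added := newLatencyMin - oldLatencyMin
	return lowMs + added, highMs + added
}

func main() {
	low, high := shiftRangeMs(4000, 5000, 200, 1000)
	fmt.Printf("glass-to-glass: %.1f-%.1f s\n", float64(low)/1000, float64(high)/1000)
	// prints "glass-to-glass: 4.8-5.8 s"
}
```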

Signal chain: Camera/Phone (WiFi + SIM1 + SIM2 + USB Ethernet) → sender app (IRL Pro / Moblin / Larix) doing SRTLA bonding → relay srtla_rec (:5000) reassembly → SLS (:4001 ingest, :4000 player) → SRT → OBS → RTMP/SRT → platform. Bonding happens on the sender, not the relay.

Proprietary server development: The telemy-srt-server project structure is scaffolded with Go module, stream registry (with tests), HTTP management API (matching sls_client.go contract: GET/POST/DELETE /api/stream-ids, GET /health, bearer auth), and SRT listener config. Uses haivision/srtgo for full socket option control. Not yet production-deployed.

SRTLA fork status: 7 modifications over upstream BELABOX: per-link statistics server (port 5080), stream ID extraction from SRT handshake TLV, ASN lookup via IPinfo Lite mmdb, force-disconnect endpoint, rate limiting + CIDR allowlist, Windows cross-platform build, static library extraction. Fork is AGPL-3.0.
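
To illustrate the CIDR allowlist idea only, here is a sketch in Go. It is not the fork's code: the Allowlist type and its methods are hypothetical, and the addresses are documentation ranges.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Allowlist holds the CIDR prefixes that are permitted to connect.
type Allowlist struct {
	prefixes []netip.Prefix
}

// NewAllowlist parses CIDR strings such as "203.0.113.0/24".
func NewAllowlist(cidrs ...string) (*Allowlist, error) {
	a := &Allowlist{}
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return nil, fmt.Errorf("parse %q: %w", c, err)
		}
		a.prefixes = append(a.prefixes, p.Masked())
	}
	return a, nil
}

// Allows reports whether ip falls inside any configured prefix.
// Unparseable addresses are rejected.
func (a *Allowlist) Allows(ip string) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false
	}
	addr = addr.Unmap() // treat IPv4-mapped IPv6 addresses as IPv4
	for _, p := range a.prefixes {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	al, err := NewAllowlist("203.0.113.0/24", "2001:db8::/32")
	if err != nil {
		panic(err)
	}
	for _, ip := range []string{"203.0.113.7", "192.0.2.1", "2001:db8::1"} {
		fmt.Printf("%s allowed: %v\n", ip, al.Allows(ip))
	}
}
```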

Key Decisions

  • 2026-04-07: SRT + SRTLA remains the primary protocol. Production proven, best mobile app ecosystem, mature tooling. RIST Main Profile and QUIC/MoQ are watch items only.
  • 2026-04-07: No static FEC on cellular uplink. A constant 30% bandwidth overhead wastes scarce cellular uplink: on an 8 Mbps link, video is capped at ~5.6 Mbps. BELABOX does not use FEC. Haivision recommends ARQ-only for internet/cellular.
  • 2026-04-07: FEC on relay-to-OBS leg is optional, not mandatory. Useful for a lossy last mile (home internet, VPN, Starlink backhaul). Wasteful on clean LAN/studio connections. Per-connection toggle or auto-detect based on measured loss.
  • 2026-04-07: Start latency_min=1000 (not 2000). Conservative approach to preserve glass-to-glass latency while testing quality improvement.
  • 2026-04-07: Build proprietary Go SRT server (Track 2) instead of forking OpenIRL SLS. Clean path to owning the full stack, no throwaway fork.
  • 2026-04-07: Clean-room SRTLA rewrite approved. AGPL prevents proprietary enhancements without source disclosure. Protocol is simple enough to rewrite (~1000 lines of C). MIT-licensed Rust sender (irlserver/srtla_send) available as reference.
  • 2026-04-07: Global servers (direct relay deployment) over edge relay cascading. Simpler architecture, direct user connection, matches v0.0.6 mesh design.
  • 2026-04-07: 2GB-4GB relay nodes are sufficient. srtla_rec + SLS are lightweight. RAM/CPU are not the bottleneck. Practical capacity: ~15-20 concurrent 1080p sessions at 8Mbps with retransmit headroom on a 1Gbps node.
  • 2026-04-07: HEVC as recommended codec. 30-40% savings over H.264, universal support on flagships since 2018. AV1 not viable (only Pixel 8+ and Apple M5 Pro/Max have HW encode).
  • 2026-04-07: Android app first (RootEncoder, Apache-2.0). No equivalent open-source Android app with SRTLA exists. iOS deferred (Moblin serves iOS users). Galaxy S20 class (Snapdragon 865) as minimum device target.
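
The two FEC decisions above can be sketched as follows. The first function reproduces the uplink budget figure, treating the 30% as a share of the link as that decision does. The second is a hypothetical auto-detect policy for the relay-to-OBS leg; the 1% loss threshold is an assumed placeholder, not a value from the design.

```go
package main

import "fmt"

// videoBudgetMbps returns the uplink left for video when a fixed share
// of the link is reserved for FEC.
func videoBudgetMbps(uplinkMbps, fecShare float64) float64 {
	return uplinkMbps * (1 - fecShare)
}

// fecEnabled is a hypothetical per-connection policy: enable FEC only
// when measured loss reaches the threshold.
func fecEnabled(measuredLossPct, thresholdPct float64) bool {
	return measuredLossPct >= thresholdPct
}

func main() {
	fmt.Printf("video budget: %.1f Mbps\n", videoBudgetMbps(8, 0.30))

	const thresholdPct = 1.0 // assumed value, not from the design
	for _, loss := range []float64{0.0, 0.4, 2.5} {
		fmt.Printf("loss %.1f%% -> FEC %v\n", loss, fecEnabled(loss, thresholdPct))
	}
}
```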

Gotchas & Known Issues

  • OpenIRL SLS does not expose oheadbw, peeridletimeo, or packetfilter in sls.conf. These require the proprietary Go SRT server (Track 2) or an SLS fork patch. Track 1 (config change) is limited to latency_min/latency_max adjustments.
  • srtgo requires libsrt installed. Development requires apt install libsrt-openssl-dev (Linux) or building from Haivision/srt source. The Docker relay image must include libsrt.
  • IRL Pro is built on the discontinued Larix SDK. Android-only for full features, no API, tied to IRLToolkit infrastructure. Not a viable long-term dependency.
  • Frantech “unlimited” bandwidth is unverified. The LV server has 1GB RAM and “unlimited” bandwidth at 1Gbps, but actual limits under sustained UDP load need stress testing.
  • AGPL compliance required until clean-room rewrite ships. Running the modified srtla-fork as a network service requires source disclosure. BELABOX addressed this by writing separate proprietary cloud software.
  • Compatibility test matrix required before production rollout. SRT parameter changes must be tested with real IRL Pro, BELABOX, and Moblin senders. No automated test infrastructure for this yet.
  • AV1 hardware encoding not viable for general use. Only Google Tensor G3+ (Pixel 8+) and Apple M5 Pro/Max have AV1 HW encode. No Qualcomm, Samsung, MediaTek, or mainstream Apple phone chips support it.
  • Operational requirements must be addressed before multi-region rollout: canary/rollback criteria, UDP kernel tuning (sysctl socket buffer sizes, conntrack), bandwidth overage alarms, DDoS posture for UDP services, per-stream abuse controls, relay-side encryption (SRT passphrase policy), Prometheus/SLOs, and node drain/rollback procedures.

Open Questions

  • VPS provider selection for global servers. Need updated cost analysis with UDP throughput testing per provider (OVHcloud, BuyVM/Frantech, Hetzner, Vultr).
  • SRTLA rewrite language: Rust (performance, matches srtla_send reference) vs Go (API codebase consistency, simpler concurrency model, slightly higher packet processing latency).
  • Mobile app revenue model: free with Telemy relay integration (drives subscriptions), free for Telemy users only, or fully free as adoption funnel.
  • SLS vs alternative SRT server. Is OpenIRL SLS the right baseline, or should Nimble Streamer or srt-live-transmit be evaluated?
  • RIST Main Profile timeline. When will IRL Pro or Moblin support Main Profile bonding? This would eliminate the SRTLA proxy layer entirely.
  • What is the status of Track 1 (config tuning) deployment? Has latency_min been changed to 1000 on either server?
  • What is the status of Track 2 (proprietary SRT server)? The scaffold exists in telemy-srt-server/ but production readiness is unknown.

Sources