Next-Gen Streaming Infrastructure

Summary

On 2026-04-07, Michael Pentz and Claude co-authored the next-generation streaming infrastructure design, building on the v0.0.6 Global Relay Mesh Design from 2026-03-21. The design addresses three core IRL streaming pain points: high costs ($100/SIM x 4 carriers), unreliable connections, and complex setup. The long-term vision is “click live anywhere”: a phone with WiFi and 2 SIMs produces a reliable, high-quality stream with minimal configuration.

The design is organized into 4 phases. Phase 1 (approved) covers immediate SRT parameter tuning on existing infrastructure and SRTLA receiver improvements. Phase 2 (approved, pending research) covers global server deployment across 8+ regions and a clean-room SRTLA receiver rewrite to eliminate AGPL licensing obligations. Phase 3 (approved, Android first) is a proprietary Android streaming app built on RootEncoder (Apache-2.0). Phase 4 is an ongoing technology watch covering RIST Main Profile, QUIC/MoQ, AV1 hardware encoding, and WiFi 7 MLO.

The implementation plan defines three parallel development tracks: Track 1 ships immediately via config changes (latency_min from 200ms to 1000ms on existing relay servers), Track 2 builds a proprietary Go SRT server using haivision/srtgo bindings to replace OpenIRL SLS entirely, and Track 3 is the clean-room SRTLA receiver rewrite. Track 2 implements the exact API contract from sls_client.go so the existing Go control plane (pool_provisioner.go) works unchanged. The implementation plan includes full code scaffolds for the stream registry, HTTP management API, SRT listener with socket options, and config structures.

Timeline

2026-03-21: v0.0.6 Global Relay Mesh Design completed. Foundation for next-gen infrastructure.
2026-04-07: Server audit performed on KC (Advin, 4vCPU/8GB/32TB/1Gbps) and LV (Frantech, 1GB/“unlimited”/1Gbps) relay servers. Findings: both run default SRT params (lossmaxttl=0, latency_min=200, oheadbw=25%) and neither has a stats_token configured. UFW on LV was hardened during the audit.
2026-04-07: Next-gen streaming infrastructure design authored and approved (Phase 1). Phases 2-4 pending research.
2026-04-07: Implementation plan authored with three parallel tracks: config tuning (ships now), proprietary Go SRT server, and clean-room SRTLA receiver.
2026-04-07: Research completed on FEC, protocol landscape (SRT, RIST, QUIC/MoQ, WebRTC), AV1 codec status, mobile app landscape (IRL Pro, Moblin, RootEncoder), and AGPL licensing implications.

Current State

Production relay servers (KC + LV): Both run identical configs with default SRT parameters. The OpenIRL SLS already hardcodes lossmaxttl=40 on the SRTLA listener and applies SRTO_SRTLAPATCHES (disables dynamic reorder tolerance + periodic NAK for SRTLA connections). latency_min and latency_max are configurable (currently 200 ms and 5000 ms), but oheadbw, peeridletimeo, and packetfilter/fec are not exposed in sls.conf and require an SLS fork patch or the proprietary replacement. The stats endpoint is unauthenticated (no --stats_token).

Glass-to-glass latency: Currently ~4-5 seconds (phone to relay to OBS to platform to viewer). Increasing latency_min from 200ms to 1000ms adds ~800ms buffer at the relay, pushing glass-to-glass to ~5-6 seconds. Trade-off: fewer stutters and drops, more consistent quality.
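
The arithmetic behind that estimate can be sketched as follows. This is only an illustration, assuming the relay buffer grows by exactly the latency_min delta and that every other stage of the chain is unchanged. The function name is hypothetical.

```python
def glass_to_glass_after_tuning(current_s, old_latency_min_ms, new_latency_min_ms):
    """Estimate the glass-to-glass range after raising latency_min.

    Assumption: the relay buffer grows by exactly the latency_min delta and
    no other stage (encoder, OBS, platform, player) changes.
    """
    added_s = (new_latency_min_ms - old_latency_min_ms) / 1000.0
    low, high = current_s
    return (low + added_s, high + added_s)


# Figures from the text: ~4-5 s today, latency_min raised from 200 ms to 1000 ms.
low, high = glass_to_glass_after_tuning((4.0, 5.0), 200, 1000)
print(f"estimated glass-to-glass: {low:.1f}-{high:.1f} s")
```

With the figures above this comes to roughly 4.8-5.8 seconds, which the text rounds to ~5-6 seconds.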

Signal chain: Camera/Phone (WiFi + SIM1 + SIM2 + USB Ethernet) → sender app (IRL Pro / Moblin / Larix) doing SRTLA bonding → relay srtla_rec (:5000) reassembly → SLS (:4001 ingest, :4000 player) → OBS over SRT → platform over RTMP/SRT. Bonding happens on the sender, not the relay.

Proprietary server development: The telemy-srt-server project structure is scaffolded with Go module, stream registry (with tests), HTTP management API (matching sls_client.go contract: GET/POST/DELETE /api/stream-ids, GET /health, bearer auth), and SRT listener config. Uses haivision/srtgo for full socket option control. Not yet production-deployed.
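
To make the management API contract concrete, here is a minimal client-side sketch that builds (but does not send) bearer-authenticated requests for the endpoints listed above. The base URL, token, and JSON payload shape are assumptions for illustration only; the real field names are whatever sls_client.go defines. Sending the token on /health is also an assumption.

```python
import json
import urllib.request


def build_request(base_url, token, method, path, payload=None):
    """Build (without sending) a bearer-authenticated management API request."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(base_url.rstrip("/") + path, data=data, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


BASE = "http://127.0.0.1:8090"  # hypothetical listen address
TOKEN = "example-token"         # hypothetical bearer token

health = build_request(BASE, TOKEN, "GET", "/health")
listing = build_request(BASE, TOKEN, "GET", "/api/stream-ids")
# The payload below is a placeholder; the actual schema is not given in this page.
create = build_request(BASE, TOKEN, "POST", "/api/stream-ids", {"stream_id": "example"})

for req in (health, listing, create):
    print(req.get_method(), req.full_url)
```

Each request object can be passed to urllib.request.urlopen once a server is actually listening at the chosen address.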

SRTLA fork status: 7 modifications over upstream BELABOX: per-link statistics server (port 5080), stream ID extraction from SRT handshake TLV, ASN lookup via IPinfo Lite mmdb, force-disconnect endpoint, rate limiting + CIDR allowlist, Windows cross-platform build, static library extraction. Fork is AGPL-3.0.

Key Decisions

2026-04-07: SRT + SRTLA remains the primary protocol. Production proven, best mobile app ecosystem, mature tooling. RIST Main Profile and QUIC/MoQ are watch items only.
2026-04-07: No static FEC on cellular uplink. 30% constant bandwidth overhead wastes precious cellular uplink (on 8 Mbps: video capped at ~5.6 Mbps). BELABOX does not use FEC. Haivision recommends ARQ-only for internet/cellular.
2026-04-07: FEC on relay-to-OBS leg is optional, not mandatory. Useful for lossy last-mile (home internet, VPN, Starlink backhaul). Wasteful on clean LAN/studio connections. Per-connection toggle or auto-detect based on measured loss.
2026-04-07: Start latency_min=1000 (not 2000). Conservative approach to preserve glass-to-glass latency while testing quality improvement.
2026-04-07: Build proprietary Go SRT server (Track 2) instead of forking OpenIRL SLS. Clean path to owning the full stack, no throwaway fork.
2026-04-07: Clean-room SRTLA rewrite approved. AGPL prevents proprietary enhancements without source disclosure. Protocol is simple enough to rewrite (~1000 lines of C). MIT-licensed Rust sender (irlserver/srtla_send) available as reference.
2026-04-07: Global servers (direct relay deployment) over edge relay cascading. Simpler architecture, direct user connection, matches v0.0.6 mesh design.
2026-04-07: 2-4 GB relay nodes are sufficient. srtla_rec + SLS are lightweight. RAM/CPU are not the bottleneck. Practical capacity: ~15-20 concurrent 1080p sessions at 8 Mbps with retransmit headroom on a 1 Gbps node.
2026-04-07: HEVC as recommended codec. 30-40% savings over H.264, universal support on flagships since 2018. AV1 not viable (only Pixel 8+ and Apple M5 Pro/Max have HW encode).
2026-04-07: Android app first (RootEncoder, Apache-2.0). No equivalent open-source Android app with SRTLA exists. iOS deferred (Moblin serves iOS users). Galaxy S20 class (Snapdragon 865) as minimum device target.
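
The two FEC decisions above can be illustrated with a short sketch. It assumes the 30% figure is a share of the total uplink (the reading that reproduces ~5.6 Mbps on an 8 Mbps link), and it uses a placeholder loss threshold for the relay-to-OBS toggle. Both function names and the threshold are hypothetical, not tuned values from the design.

```python
def video_budget_mbps(uplink_mbps, fec_share):
    """Bandwidth left for video when a fixed share of the uplink carries FEC.

    Assumption: fec_share is a fraction of the total uplink, not a
    percentage added on top of the video bitrate.
    """
    if not 0.0 <= fec_share < 1.0:
        raise ValueError("fec_share must be in [0, 1)")
    return uplink_mbps * (1.0 - fec_share)


def should_enable_fec(measured_loss_pct, threshold_pct=1.0):
    """Per-connection FEC toggle for the relay-to-OBS leg.

    threshold_pct is a hypothetical placeholder; a real value would come
    from testing on lossy last-mile links.
    """
    return measured_loss_pct >= threshold_pct


# Cellular uplink: a static 30% FEC share on an 8 Mbps link.
print(round(video_budget_mbps(8.0, 0.30), 2), "Mbps left for video")

# Relay-to-OBS leg: clean studio LAN vs. lossy home connection.
print("clean LAN   ->", should_enable_fec(0.0))
print("lossy link  ->", should_enable_fec(2.5))
```

Driving the toggle from measured loss keeps clean LAN and studio connections free of FEC overhead while still allowing it on lossy last-mile paths.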

Gotchas & Known Issues

OpenIRL SLS does not expose oheadbw, peeridletimeo, or packetfilter in sls.conf. These require the proprietary Go SRT server (Track 2) or an SLS fork patch. Track 1 (config change) is limited to latency_min/latency_max adjustments.
srtgo requires libsrt installed. Development requires apt install libsrt-openssl-dev (Linux) or building from Haivision/srt source. The Docker relay image must include libsrt.
IRL Pro is built on the discontinued Larix SDK. Android-only for full features, no API, tied to IRLToolkit infrastructure. Not a viable long-term dependency.
Frantech “unlimited” bandwidth is unverified. The LV server has 1GB RAM and “unlimited” bandwidth at 1Gbps, but actual limits under sustained UDP load need stress testing.
AGPL compliance required until clean-room rewrite ships. Running the modified srtla-fork as a network service requires source disclosure. BELABOX addressed this by writing separate proprietary cloud software.
Compatibility test matrix required before production rollout. SRT parameter changes must be tested with real IRL Pro, BELABOX, and Moblin senders. No automated test infrastructure for this yet.
AV1 hardware encoding not viable for general use. Only Google Tensor G3+ (Pixel 8+) and Apple M5 Pro/Max have AV1 HW encode. No Qualcomm, Samsung, MediaTek, or mainstream Apple phone chips support it.
Operational requirements must be addressed before multi-region rollout: canary/rollback criteria, UDP kernel tuning (sysctl socket buffer sizes, conntrack), bandwidth overage alarms, DDoS posture for UDP services, per-stream abuse controls, relay-side encryption (SRT passphrase policy), Prometheus/SLOs, and node drain/rollback procedures.

Open Questions

VPS provider selection for global servers. Need updated cost analysis with UDP throughput testing per provider (OVHcloud, BuyVM/Frantech, Hetzner, Vultr).
SRTLA rewrite language: Rust (performance, matches srtla_send reference) vs Go (API codebase consistency, simpler concurrency model, slightly higher packet processing latency).
Mobile app revenue model: free with Telemy relay integration (drives subscriptions), free for Telemy users only, or fully free as adoption funnel.
SLS vs alternative SRT server. Is OpenIRL SLS the right baseline, or should Nimble Streamer or srt-live-transmit be evaluated?
RIST Main Profile timeline. When will IRL Pro or Moblin support Main Profile bonding? This would eliminate the SRTLA proxy layer entirely.
What is the status of Track 1 (config tuning) deployment? Has latency_min been changed to 1000 on either server?
What is the status of Track 2 (proprietary SRT server)? The scaffold exists in telemy-srt-server/ but production readiness is unknown.
