API Architecture
Summary
The GoLiveBro (formerly Telemy) control plane is a Go API server that manages the full lifecycle of IRL streaming relay sessions, user authentication, billing, and operational telemetry. The API runs as a single process (cmd/api/main.go) that handles both HTTP requests and background jobs as goroutines, with an optional standalone jobs binary (cmd/jobs/main.go) for future horizontal scaling. The system sits behind Cloudflare (TLS termination + proxy) and communicates with a PostgreSQL 16 database, a pool of always-on relay VPS nodes, and external services (LemonSqueezy for billing, Cloudflare DNS for relay hostname management).
The v0.0.5 architecture replaced per-user ephemeral EC2 instances with a shared relay pool model (PoolProvisioner), deployed on 2026-03-23. Relay sessions now provision in under 1 second from always-on Advin VPS nodes rather than waiting for EC2 boot. The OBS plugin (single C++ DLL) authenticates via a browser-handoff login flow, receives short-lived JWTs for control-plane calls, and stores credentials in a DPAPI-encrypted vault. Relay entitlement is enforced server-side based on plan tier, with LemonSqueezy webhook events maintaining a local mirror of subscription state.
The API contract is versioned at /api/v1 and is served over HTTPS only (TLS 1.2+). The system supports 11 core endpoints spanning auth (session, login, refresh, logout), relay lifecycle (start, stop, active, manifest, health), billing (checkout, webhook), usage metering, and a public capacity status endpoint. Relay start is idempotent: each request carries a UUIDv4 Idempotency-Key header, keys are retained for 1 hour, and repeated requests follow async replay semantics.
Timeline
- 2024-01-01: Project started as Telemyapp
- v0.0.3: Multi-process Rust/C++ hybrid architecture with IPC pipes
- v0.0.4: Replaced with all-native C++ single-DLL OBS plugin. Per-link relay telemetry (direct HTTP polling from C++ plugin to relay stats endpoint) and per-output multi-encode telemetry implemented.
- v0.0.5: Browser-handoff plugin login flow, session management, entitlement-gated relay start, billing integration (LemonSqueezy Phase 5a), PoolProvisioner (shared relay pool), stream slot system, BYOR support, always-ready relay model.
- 2026-03-20: API migration plan from EC2 to Advin VPS created.
- 2026-03-22: AWS references retired from documentation, pool relay model became canonical.
- 2026-03-23: Always-ready relay model deployed (AR-0 through AR-3). AWS provisioner deprecated.
- 2026-04-02: Repo split design approved (public OBS plugin MIT + private backend).
Current State
The API server runs on an Advin VPS (208.84.101.84) behind Cloudflare proxy at api.telemyapp.com (migrating to api.golivebro.com). PostgreSQL 16 runs in Docker on the same host. The single cmd/api binary handles HTTP requests on port 8080 and all background jobs (health checks at 60s intervals, grace enforcement at 5min intervals, subscription reconciliation at 1hr intervals, idempotency TTL cleanup, session usage rollup, outage reconciliation).
The relay stack uses always-on VPS nodes registered in the relay_pool table. Each node runs a dual-process Docker Compose stack: srtla_rec (UDP port 5000, bonded SRTLA proxy) and SLS v1.5.0 (SRT Live Server, ingest on port 4001, player on port 4000). Additional ports: SLS management API (8090), per-link stats (5080), SLS web UI (3000). The PoolProvisioner assigns sessions atomically using FOR UPDATE SKIP LOCKED on the least-loaded healthy server, registers stream IDs via the SLS management API, and returns session details in under 1 second.
Authentication uses a browser-handoff pattern: the plugin calls POST /auth/plugin/login/start, opens the browser URL, and polls until the web tier completes the flow. JWTs carry uid and sid claims. Refresh tokens are hashed server-side in auth_sessions. The billing system receives LemonSqueezy webhook events, verifies their HMAC-SHA256 signatures, stores the events in billing_events, and routes actions (activate, downgrade, payment recovery, grace period management) to update user plan state.
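Webhook signature verification of this kind usually looks like the sketch below, which assumes the signature arrives hex-encoded in a request header and is computed over the raw request body with a shared signing secret. Sign and VerifySignature are hypothetical helper names; the header name and secret handling are not specified in this document.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Sign returns the hex-encoded HMAC-SHA256 of body under secret.
// It exists here so the example can produce a valid signature.
func Sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// VerifySignature reports whether signatureHex is a valid HMAC-SHA256
// of the raw body. hmac.Equal compares in constant time.
func VerifySignature(secret, body []byte, signatureHex string) bool {
	got, err := hex.DecodeString(signatureHex)
	if err != nil {
		return false // not valid hex, reject
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("example-signing-secret") // placeholder value
	body := []byte(`{"event":"example"}`)      // placeholder payload

	sig := Sign(secret, body)
	fmt.Println("valid signature accepted:", VerifySignature(secret, body, sig))
	fmt.Println("tampered body accepted:", VerifySignature(secret, []byte(`{"event":"other"}`), sig))
}
```

The check has to run against the exact bytes received, before any JSON decoding, since re-serializing the payload can change whitespace or key order and invalidate the signature.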
The database has 11 tables: users, auth_sessions, api_keys, plugin_login_attempts, relay_instances (legacy AWS), sessions, relay_pool, relay_assignments, user_stream_slots, billing_events, and idempotency_records. Primary keys use text IDs with prefixes (usr_, ses_, evt_, etc.). Plan tiers are free, standard, internal. Plan statuses are active, past_due, canceled, trial.
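A prefixed text ID scheme like this can be sketched as a type prefix followed by a random suffix. The suffix format below (12 random bytes, hex-encoded) is an assumption, as are the NewID and Prefix helper names; only the prefixes themselves (usr_, ses_, evt_) come from the schema description.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strings"
)

// NewID returns prefix followed by 24 hex characters of randomness.
// The suffix length and encoding are illustrative assumptions.
func NewID(prefix string) (string, error) {
	buf := make([]byte, 12)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return prefix + hex.EncodeToString(buf), nil
}

// Prefix extracts the type prefix (including the underscore) from an ID,
// or returns "" if the ID has no underscore.
func Prefix(id string) string {
	i := strings.Index(id, "_")
	if i < 0 {
		return ""
	}
	return id[:i+1]
}

func main() {
	for _, p := range []string{"usr_", "ses_", "evt_"} {
		id, err := NewID(p)
		if err != nil {
			panic(err)
		}
		fmt.Println(id, "has prefix", Prefix(id))
	}
}
```

The practical benefit of the prefix is that an ID pasted into a log line or support ticket identifies its table at a glance, and a handler can cheaply reject an ID of the wrong kind.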
Prometheus metrics are exposed at GET /metrics with telemy_ prefix (migrating to glb_). Key metrics include relay provision/deprovision totals and latency histograms, job run totals and duration histograms, and legacy AWS operation metrics (deprecated but retained).
Key Decisions
- v0.0.4: Replaced multi-process Rust/C++ hybrid with single-DLL all-native C++ plugin. Eliminated IPC layer entirely.
- v0.0.5: Chose browser-handoff login flow over embedded browser or manual key entry. Stores only backend-issued auth material in plugin vault.
- v0.0.5: Server-side entitlement enforcement on relay start. Dock UI disables controls as UX convenience but backend is authoritative.
- 2026-03-23: Replaced per-user ephemeral EC2 instances with shared relay pool on always-on VPS nodes. Eliminated AWS boot delay. Provision time reduced from minutes to under 1 second.
- 2026-03-23: Adopted always-ready model: relays auto-provision when a managed connection is added and auto-deprovision when removed. No user-initiated connect/disconnect buttons.
- v0.0.5 Phase 5a: Chose LemonSqueezy as Merchant of Record for billing. HMAC-SHA256 webhook verification. 48-hour grace period on payment failure before downgrade to free tier.
- 2026-04-02: Approved hybrid open-source model. OBS plugin (MIT) goes public; backend stays private. Rationale: builds trust for native DLL, keeps auth/billing/provisioning as competitive moat.
Gotchas & Known Issues
- request_id and structured details in error responses are not currently populated. The Go server returns only error.code and error.message.
- cmd/jobs does not run an HTTP listener, so there is no separate metrics scrape endpoint for background jobs. Job metrics are only visible from the process hosting the metrics handler (i.e., cmd/api).
- Relay health endpoint (POST /api/v1/relay/health) uses X-Relay-Auth shared secret, not JWT. Must include both session_id and instance_id with backend validation.
- TCP port 5080 on relay security groups must be accessible from OBS machines for per-link stats. Port 8090 must be accessible for SLS management API.
- Stats endpoint on port 5080 has no authentication (internal relay port).
- The relay_instances table and associated AWS EIP columns on users (eip_allocation_id, eip_public_ip) are deprecated legacy from the AWS provisioner era but retained for archive reference.
- Stripe env vars (GLB_STRIPE_*) not yet configured on production. Billing webhooks return “not configured” until set. Migration from LemonSqueezy to Stripe is in progress.
- Dock bundles still reference telemyapp paths. Blocked until OBS plugin repo is migrated.
- Session state machine (STATE_MACHINE_v1.md) governs scene switching (Phase 5b, not yet implemented). Relay lifecycle is governed by the always-ready model, not this state machine.
- BYOR relays may not run the same SLS + srtla-receiver stack. Plugin handles missing stats endpoints gracefully with “Stats unavailable” fallback.
- The subscription_cancelled LemonSqueezy event is a no-op. Access continues until cycle end; subscription_expired triggers the actual downgrade.
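The cancellation behaviour in the last item can be sketched as an event router. Only the handling of subscription_cancelled (no-op) and subscription_expired (downgrade) comes from this document; the Action type, the RouteEvent function, and the mappings for the two payment events are assumptions added to show the shape of the switch.

```go
package main

import "fmt"

// Action is a hypothetical label for what the billing handler does next.
type Action string

const (
	ActionNone       Action = "none"
	ActionDowngrade  Action = "downgrade_to_free"
	ActionStartGrace Action = "start_grace_period"
	ActionClearGrace Action = "clear_grace_period"
)

// RouteEvent maps a webhook event name to an action.
func RouteEvent(eventName string) Action {
	switch eventName {
	case "subscription_cancelled":
		// No-op: access continues until the billing cycle ends.
		return ActionNone
	case "subscription_expired":
		// The actual downgrade happens here.
		return ActionDowngrade
	case "subscription_payment_failed":
		// Assumed mapping, not confirmed by this document.
		return ActionStartGrace
	case "subscription_payment_recovered":
		// Assumed mapping, not confirmed by this document.
		return ActionClearGrace
	default:
		// Unknown events are stored but trigger nothing in this sketch.
		return ActionNone
	}
}

func main() {
	for _, name := range []string{"subscription_cancelled", "subscription_expired"} {
		fmt.Printf("%s -> %s\n", name, RouteEvent(name))
	}
}
```

Treating cancellation as a no-op keeps the entitlement logic in one place: plan state only changes when the provider reports that the paid period has actually ended.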
Open Questions
- Scene switching state machine (Phase 5b) is specified but not implemented. Modes (STUDIO, IRL_CONNECTING, IRL_ACTIVE, IRL_GRACE, DEGRADED, FATAL) and scene intent logic (LIVE, BRB, OFFLINE, HOLD) are designed but not yet in the codebase.
- Standalone cmd/jobs HTTP metrics listener not implemented. Future setup would scrape port 8081 for jobs-only metrics.
- Output rename/group/visibility settings UI for multi-encode is planned but not implemented.
- Auto-detect encoder group from resolution is planned but not implemented.
- reconcile_subscriptions job is currently a stub pending production products on the new billing provider.
- Repo split (Task 2 history rewrite + Task 7 force push) requires GitLab branch unprotection as a manual step.
- Env var prefix migration from TELEMY_ to GLB_ and metrics prefix from telemy_ to glb_ are in progress.