Project History

Summary

Telemy has evolved through three major versions since early 2026. v0.0.3 (February 2026) established the foundational architecture: a Go control plane on AWS EC2, a Rust local telemetry bridge (obs-telemetry-bridge), and a C++ OBS plugin shim communicating over named-pipe IPC with MessagePack serialization. This version validated the full relay lifecycle (provision, activate, deprovision) on AWS, introduced the Aegis dock UX prototype, and resolved 15 code review findings including a critical idempotency key format mismatch (Rust generated telemy-{ts}-{random}, Go expected UUID v4) and IPC pipe NULL DACL security gaps. The relay pipeline was fully validated end-to-end on 2026-03-02, and a deep security audit on 2026-02-22 drove credential rotation, DB role hardening, and compensating cleanup for stuck provisioning sessions.
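
The idempotency key mismatch described above comes down to a format check. The following is a minimal Go sketch of such a check, not the control plane's actual validation code: the function name IsUUIDv4 and both sample keys are made up for illustration, and the exact shape of the legacy telemy-{ts}-{random} key is assumed.

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidV4 matches the canonical 8-4-4-4-12 hex form with version nibble 4
// and an RFC 4122 variant nibble (8, 9, a, or b).
var uuidV4 = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-4[0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}$`)

// IsUUIDv4 reports whether key is a canonical UUID v4 string.
// Hypothetical helper; the real control plane may validate differently.
func IsUUIDv4(key string) bool {
	return uuidV4.MatchString(key)
}

func main() {
	// Both sample keys are invented for this example.
	keys := []string{
		"telemy-1772409600-a1b2c3",             // legacy-style key, shape assumed
		"3f2b8c1e-9d4a-4f6b-8a2e-5c7d9e0f1a2b", // UUID v4
	}
	for _, key := range keys {
		fmt.Printf("%-40s valid=%v\n", key, IsUUIDv4(key))
	}
}
```

A check like this rejects the legacy-style key and accepts the UUID v4 key, which is why the fix was to change the generator rather than loosen the server.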

v0.0.4 (March 2-8, 2026) was a major architectural pivot: the entire stack consolidated into a single native C++ OBS plugin DLL (aegis-obs-plugin.dll), eliminating the Rust bridge process, named-pipe IPC layer, and all Rust/Cargo dependencies. This version introduced native Win32/NVML metrics collection, DPAPI-encrypted config vault, WinHTTP relay client, and Qt palette theme sync. A 13-finding security audit on 2026-03-08 resulted in 12 fixes and 1 acknowledged-by-design item (cleartext relay telemetry on private VPC endpoints). The Elastic IP feature was also designed, implemented, and deployed in this version to solve mobile DNS caching issues with IRL Pro on iOS.

v0.0.5 (March 2026 onward) continues the all-native C++ architecture and introduces the Connection List as the primary UI model — replacing the single-relay paradigm with independently managed BYOR and Telemy-managed connections. Key decisions include zero-friction BYOR (no account required), DPAPI-encrypted per-connection secrets, and a ConnectionManager replacing the RelayClient singleton. A comprehensive refactor audit pipeline was designed on 2026-03-10 using Codex 5.4 and Claude Opus, with results published to Confluence, Jira, and Slack. The Always-Ready Relay feature was deployed by 2026-03-23, and the AWS relay model was later retired in favor of a pool relay model on Advin VPS.

Timeline

  • 2026-02-22: Deep security audit of v0.0.3 codebase — 12 findings across Go control plane, Rust bridge, and docs. Compensating cleanup for relay/start failures implemented. EC2 credentials rotated, dedicated DB role created. Switched from AEGIS_RELAY_PROVIDER=fake to aws. Live AWS relay provisioning validated (instance i-053fa5dd3778334d0).
  • 2026-02-22: Post-audit execution — timeout mitigation for relay/start deployed (chi middleware + server WriteTimeout raised to 3m). Access hardening: SSM agent stabilized, SSH key-based access configured, IAM recovery permissions expanded.
  • 2026-02-23: IPC v1 foundations validated — Rust named-pipe server with MessagePack envelopes, C++ shim harness connected end-to-end. First plugin-to-core IPC validation on Windows. Aegis dock UX reference prototype (aegis-dock.jsx) created. Bridge host/reducer (aegis-dock-bridge.js) implemented with IPC envelope reducers.
  • 2026-02-24: Real OBS 32.0.4 plugin build validated. Callback-mode scene switching confirmed for the positive path (success) and the negative paths (scene_not_found, missing_scene_name). Scaffold-mode dock host validated in real OBS with Rust core IPC. Qt/WebEngine dock host path identified as blocked by runtime compatibility issues.
  • 2026-02-24: OBS/CEF dock host runtime brought up — browser dock successfully loaded in OBS, JS injection pipeline validated, shutdown stabilization completed.
  • 2026-02-26: Dock runtime regression recovery — scenes, toggle, theme, and title regressions fixed after integration changes.
  • 2026-02-27: Dock action status lifecycle and mode/setting mapping sync completed.
  • 2026-02-28: Full code review of telemy-v0.0.3 — 3 critical, 7 important, 5 minor findings. Critical: IPC pipe NULL DACL, token logged plaintext, non-Windows vault stores plaintext. Per-link relay telemetry spec and multi-encode/multi-upload spec added to API_SPEC_v1.md.
  • 2026-03-01: Three dock UX fixes committed — deferred dock show (QTimer pattern), synthetic theme replay, scene prefs persistence via dock_scene_prefs.json. Bridge bootstrap fix: receiveDockActionResultJson added to complete the C++ result delivery path. Gemini relay operating contract established.
  • 2026-03-02: Relay pipeline fully validated end-to-end. Bridge sendAction() now accepts relay_start/relay_stop. All 15 code review findings resolved across 3 commits. Backend hardening: UUID v4 idempotency keys, MutexExt trait replacing 36 lock().unwrap() calls, delta-based bitrate, server module split (1810 to 418 lines in mod.rs + 3 sub-modules).
  • 2026-03-02: v0.0.4 architectural pivot begins — all-native C++ OBS plugin DLL replaces Rust bridge + IPC layer.
  • 2026-03-05: E2E relay telemetry validated live — IRL Pro (bonded WiFi + T-Mobile) to AWS relay to C++ plugin to React dock. Stable 4.6 Mbps aggregate, 53ms RTT, 1000ms latency.
  • 2026-03-08: v0.0.4 security audit completed — 13 findings, 12 fixed, 1 acknowledged. Key fixes: dock JS can no longer read native secrets, detached relay threads join before g_relay.reset(), TLS decision uses scheme not port, relay health bound to instance_id. Elastic IP feature deployed — IRL Pro connects instantly on subsequent provisions. v0.0.3-era docs archived out of the v0.0.4 repo tree (9 files moved to docs/archive/).
  • 2026-03-10: Refactor audit pipeline designed — Codex 5.4 + Claude Opus two-phase audit across C++ plugin, Go control plane, React dock, and srtla forks. Results published to Confluence, Jira, and Slack.
  • 2026-03-19: v0.0.5 revised plan published. Connection List becomes the primary UI model, BYOR requires no account, and a multi-connection ConnectionManager replaces the RelayClient singleton. The free plan tier is dropped; BYOR is the offline default.
  • 2026-03-22: AWS references retired, pool relay model adopted on Advin VPS.
  • 2026-03-23: Always-Ready Relay deployed — phases 1-5 of v0.0.5 plan complete.
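
One of the 2026-03-08 audit fixes in the timeline above is that the TLS decision uses the URL scheme, not the port. The plugin itself is C++ on WinHTTP; the Go sketch below only illustrates the rule. The function name UseTLS and the example hosts are assumptions, not names from the codebase.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// UseTLS decides whether to use TLS from the URL scheme alone.
// The port is deliberately ignored, so https://host:8080 still uses TLS
// and http://host:443 does not. Hypothetical helper for illustration.
func UseTLS(rawURL string) (bool, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false, err
	}
	switch strings.ToLower(u.Scheme) {
	case "https":
		return true, nil
	case "http":
		return false, nil
	default:
		return false, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
}

func main() {
	// Example endpoints are invented; the ports deliberately disagree with the schemes.
	endpoints := []string{
		"https://relay.example:8080/health",
		"http://relay.example:443/health",
	}
	for _, raw := range endpoints {
		tls, err := UseTLS(raw)
		fmt.Printf("%s tls=%v err=%v\n", raw, tls, err)
	}
}
```

Keying on the scheme avoids the failure mode where a plaintext endpoint on port 443, or an HTTPS endpoint on a non-standard port, gets the wrong transport.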

Current State

v0.0.5 is the active version. The architecture is a single native C++ OBS plugin DLL (telemy-obs-plugin.dll) with no external process dependencies. The relay infrastructure has moved from AWS EC2 to a pool relay model on Advin VPS using the srtla-receiver Docker image. The Connection List UI model supports both BYOR (no account, local-only config) and Telemy-managed relays (requires login). The v0.0.5 documentation index lives at telemy-v0.0.5/docs/README.md and includes authoritative specs for architecture, API, auth/entitlement, state machine, DB schema, and relay deployment.

All v0.0.3-era bridge/IPC documents have been archived. The v0.0.4 audit is fully resolved (12/13 findings fixed, 1 acknowledged by design). The refactor audit pipeline outputs exist in docs/refactor-audit/ with Confluence/Jira integration.

Key Decisions

  • 2026-02-22: Switch from AEGIS_RELAY_PROVIDER=fake to aws for live relay provisioning — validated the full provision/deprovision cycle before client maturity.
  • 2026-02-23: Prioritize hybrid plugin path (OBS plugin + Rust core IPC) over browser dashboard — browser dashboard kept as transitional debug surface only.
  • 2026-02-23: Use QTimer deferred dock show pattern as standard dock initialization — respects OBS DockState serialization lifecycle.
  • 2026-03-02: Architecture pivot to all-native C++ plugin DLL — eliminated Rust bridge, IPC layer, and all Rust toolchain dependencies. Rationale: single DLL deployment, lower latency, direct OBS C API access.
  • 2026-03-08: Cleartext relay telemetry endpoints (:8090, :5080) acknowledged as by-design — private VPC security group restricts access, adding TLS to ephemeral relays adds complexity without meaningful security benefit.
  • 2026-03-08: Elastic IP per user per region for relay instances — one EIP allocated on first provision, DNS record permanent (never deleted on deprovision). Cost: free when attached, $3.60/mo idle.
  • 2026-03-08: v0.0.3-era docs archived out of pushed v0.0.4 repo tree — 9 files (IPC protocol, code reviews, QA checklists) moved to docs/archive/ to keep canonical docs current.
  • 2026-03-10: Multi-agent audit pipeline design — Codex 5.4 generates raw audit, Claude Opus reviews/augments, merged plan published to Atlassian + Slack.
  • 2026-03-19: BYOR requires no account, for zero-friction adoption; the OBS plugin is useful even without Telemy infrastructure. The free plan tier concept was dropped entirely.
  • 2026-03-19: Connections as primary data model, not single relay config — enables multi-cam, BYOR+managed hybrid, and future failover use cases.
  • 2026-03-19: Sensitive connection fields (host, port, stream ID) stored DPAPI-encrypted in vault.json keyed by connection ID — config.json stores only non-sensitive metadata.
  • 2026-03-22: AWS relay model retired in favor of pool relay model on Advin VPS — referenced in docs/README.md update.
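
The 2026-03-19 decision above to keep sensitive fields in vault.json and only metadata in config.json can be pictured as a split of each connection record. This is a rough Go sketch under several assumptions: every type and field name here is hypothetical, the real plugin is C++, and the DPAPI encryption step is omitted entirely.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Connection is a hypothetical in-memory record; all field names are assumed.
type Connection struct {
	ID       string
	Name     string
	Kind     string // "byor" or "managed" (assumed values)
	Host     string
	Port     int
	StreamID string
}

// Meta is the non-sensitive part destined for config.json.
type Meta struct {
	ID   string `json:"id"`
	Name string `json:"name"`
	Kind string `json:"kind"`
}

// Secret is the sensitive part destined for vault.json.
type Secret struct {
	Host     string `json:"host"`
	Port     int    `json:"port"`
	StreamID string `json:"stream_id"`
}

// Split separates connections into config metadata and a vault map
// keyed by connection ID.
func Split(conns []Connection) ([]Meta, map[string]Secret) {
	metas := make([]Meta, 0, len(conns))
	vault := make(map[string]Secret, len(conns))
	for _, c := range conns {
		metas = append(metas, Meta{ID: c.ID, Name: c.Name, Kind: c.Kind})
		vault[c.ID] = Secret{Host: c.Host, Port: c.Port, StreamID: c.StreamID}
	}
	return metas, vault
}

func main() {
	metas, vault := Split([]Connection{
		{ID: "conn-1", Name: "Main cam", Kind: "byor",
			Host: "relay.example", Port: 5000, StreamID: "live/abc"},
	})
	cfg, err := json.Marshal(metas)
	if err != nil {
		panic(err)
	}
	// In the real plugin this payload would be DPAPI-encrypted before it is written.
	vlt, err := json.Marshal(vault)
	if err != nil {
		panic(err)
	}
	fmt.Println("config.json:", string(cfg))
	fmt.Println("vault.json (pre-encryption):", string(vlt))
}
```

Keying the vault by connection ID lets a single connection be added or removed without touching the secrets of the others.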

Experiments & Results

| Experiment | Status | Finding | Source |
| --- | --- | --- | --- |
| Qt/WebEngine dock host in OBS | Blocked | Runtime compatibility blocker; scaffold fallback path used instead. CEF browser dock became the production path. | HANDOFF_HISTORY.md |
| Strict relay validation with AfterTimestamp only | Resolved | Log generation conditions blocked strict evidence. Filtered Fallback Strategy adopted, then superseded by validated terminal result path. | TRIAGE_RELAY_VALIDATION.md |
| E2E relay telemetry (IRL Pro to dock) | Validated | Stable 4.6 Mbps aggregate, 53ms RTT, 1000ms latency. No code changes needed; validation-only session. | GEMINI_RELAY.md |
| Elastic IP for mobile DNS caching | Deployed | IRL Pro on iOS caches stale IPs. EIP allocation on first provision solved instant-connect on subsequent provisions. Fallback to auto-assigned IP if EIP allocation fails. | TELEMY_v0.0.4_AUDIT_STATUS_2026_03_08.md |
| Multi-agent refactor audit (Codex 5.4 + Claude Opus) | Designed | Pipeline: Codex raw audit, Opus review/augment, merged plan. Published to Confluence/Jira/Slack. Scope: 12 C++ source files, ~10 JS/JSX files, Go control plane, 2 srtla forks. | 2026-03-10-refactor-audit-pipeline-design.md |
| MutexExt poison recovery pattern | Shipped | Replaced 36 lock().unwrap() calls with MutexExt trait in util.rs. Prevents panics from poisoned mutexes. | GEMINI_RELAY.md |
| Delta-based instantaneous bitrate | Shipped | Replaced session-average bitrate with delta-based calculation for real-time accuracy. | GEMINI_RELAY.md |
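
The difference between the session-average and delta-based bitrate in the last row can be shown with a small calculation. The sketch below is in Go for illustration only (the shipped change was in the project's own backend code), and the Sample type, its fields, and the sample numbers are all assumptions.

```go
package main

import "fmt"

// Sample is a cumulative byte counter reading taken at AtMs milliseconds.
// Hypothetical type for this sketch.
type Sample struct {
	AtMs  int64
	Bytes uint64
}

// DeltaBitrate returns bits per second between two samples.
// It returns 0 if time did not advance or the counter went backwards
// (for example after a counter reset).
func DeltaBitrate(prev, cur Sample) float64 {
	dt := float64(cur.AtMs-prev.AtMs) / 1000
	if dt <= 0 || cur.Bytes < prev.Bytes {
		return 0
	}
	return float64(cur.Bytes-prev.Bytes) * 8 / dt
}

func main() {
	// Invented readings: a fast first 10 s, then a slow final second.
	start := Sample{AtMs: 0, Bytes: 0}
	prev := Sample{AtMs: 10_000, Bytes: 10_000_000}
	cur := Sample{AtMs: 11_000, Bytes: 10_100_000}

	// Measuring from the session start gives the session average.
	fmt.Printf("session average: %.0f bps\n", DeltaBitrate(start, cur))
	// Measuring from the previous sample gives the instantaneous rate.
	fmt.Printf("last interval:   %.0f bps\n", DeltaBitrate(prev, cur))
}
```

With these readings the session average is still about 7.3 Mbps while the last interval is 0.8 Mbps, so only the delta-based figure reflects the drop in real time.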

Gotchas & Known Issues

  • OBS CEF constraints: IIFEs can crash CEF, clipboard API is broken. Dock JS runs in a restricted CEF browser context, not a standard browser environment.
  • document.title bridge transport race: CEF titleChanged events can be coalesced or delayed during initial page load. Workaround: SelfTestDirectPluginIntake parameter for automated validation.
  • OBS in-process relay HTTP timeout: relay/start calls from within OBS process occasionally hit ~15s timeout during heavy OBS startup. CLI calls succeed instantly — the protocol and auth layers are healthy.
  • Relay telemetry cleartext: Stats endpoints (:8090, :5080) are intentionally cleartext on private VPC. Security group sg-0da8cf50c2fd72518 restricts access. Future hardening could add relay-side auth tokens if threat model changes.
  • EIP idle cost: An Elastic IP not attached to a running instance costs $3.60/mo per user. This must be factored into pricing.
  • Config migration between versions: v0.0.4 plugin config path moved from aegis-obs-shim to aegis-obs-plugin within OBS plugin_config directory. v0.0.3 Rust bridge and IPC protocol are incompatible with v0.0.4+.
  • sls-management-ui Docker image: Pinned to sha256:2cd2c4ea05bd75144b3b30f735e62665dbb8c1352245e5b8f994790582cff007 as of 2026-03-08. Digest pinning required manual registry lookup.
  • Gemini file ownership: Gemini CLI must not edit active implementation files (C++ plugin src, dock JSX, bridge JS). Restricted to docs, summaries, triage, QA artifacts.
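
To give a feel for the EIP idle cost item above, here is a back-of-the-envelope Go sketch. The $3.60/mo figure is from this document; the linear proration by idle share, the function name, and the fleet numbers are assumptions made only for this example.

```go
package main

import "fmt"

// IdleCentsPerMonth is the per-user idle EIP cost from the text ($3.60/mo), in cents.
const IdleCentsPerMonth = 360

// FleetIdleCostCents estimates the monthly EIP cost in cents for a fleet.
// idlePercent is the share of the month (0-100) that each user's EIP spends
// unattached; linear proration is an assumption of this sketch.
func FleetIdleCostCents(users, idlePercent int) int {
	if users < 0 || idlePercent < 0 {
		return 0
	}
	if idlePercent > 100 {
		idlePercent = 100
	}
	return users * IdleCentsPerMonth * idlePercent / 100
}

func main() {
	// Invented fleet: 1000 users whose relays are stopped 95% of the time.
	cents := FleetIdleCostCents(1000, 95)
	fmt.Printf("$%d.%02d per month\n", cents/100, cents%100)
}
```

Because streamers are offline most of the time, the idle share dominates, and the per-user cost stays close to the full $3.60/mo.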

Open Questions

  • Per-link relay telemetry requires srtla_rec fork to expose per-link metadata — status of this fork work is not documented in current sources.
  • Video stabilization (originally Phase 6 in v0.0.5 plan) deferred to v0.0.6 — no timeline or spec exists.
  • Multi-tenant collaboration use case (each streamer manages own connection entry) mentioned in v0.0.5 vision but no implementation details beyond the Connection List model.
  • Failover/backup relay (ordered priority list, automatic failover between connections) listed as future capability — no spec or timeline.
  • Store approval for Phase 5a billing is a documented blocker — current status unclear from these sources.

Sources

  • HANDOFF_HISTORY.md
  • HANDOFF_STATUS_ARCHIVE.md
  • GEMINI_RELAY.md
  • CHANGELOG-v0.0.4.md
  • QA-v0.0.4.md
  • TELEMY_v0.0.4_AUDIT_STATUS_2026_03_08.md
  • TRIAGE_RELAY_VALIDATION.md
  • 2026-03-19-v005-plan-revised.md
  • docs/README.md
  • 2026-03-10-refactor-audit-pipeline-design.md
  • 2026-03-10-refactor-audit-pipeline.md
  • archive/plans/README.md
  • telemy-v0.0.4-removed-docs-2026-03-08/README.md