Chat

Summary

Phase 5b adds a server-side chat transport layer that connects Twitch and YouTube live chat to the OBS plugin’s chatbot runtime, enabling viewers to trigger scene switches via chat commands. The Go control plane receives Twitch EventSub webhook notifications (using the Conduit transport with up to 20,000 shards) and polls YouTube’s liveChatMessages API. It evaluates commands server-side against the user’s chatbot config, sends bot replies immediately via platform APIs, and queues scene-switch commands, which the OBS plugin picks up by polling GET /api/v1/chat/pending every 1.5 seconds. The chatbot is free for all users, with no entitlement check.

The architecture deliberately keeps command evaluation on the server rather than relaying raw chat to the plugin. This means bot replies are instant (no round-trip to OBS), the plugin only acts on scene switches, the server already has user config from heartbeat sync, and the system works even when OBS is temporarily disconnected. A single central server handles chat (latency tolerance is seconds, not milliseconds like video), and all chat components run inside the existing telemy-api and telemy-jobs containers on the Advin server (4 CPU, 8GB RAM, 90% idle), which can handle 500+ concurrent chatbot users before scaling triggers activate.

The implementation spans 8 new Go source files, 3 Go test files, 1 Postgres migration, and modifications to approximately 2 C++ plugin files plus the dock JSX. Twitch integration is the primary path; YouTube is deferred until after Twitch works end-to-end due to a critical quota constraint (10,000 API units/day). The Go evaluator is a direct port of the C++ ChatbotRuntime::HandleCommand logic, supporting commands: status, scene <name>, auto (resume automation), and alias-based scene switching.
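
The command set described above can be illustrated with a minimal Go sketch. This is not the planned evaluator or a port of ChatbotRuntime::HandleCommand; the Config and Result types, the "!" prefix, the reply strings, and the simplified broadcaster-only check are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Config is a hypothetical, trimmed-down chatbot config.
// Field names are assumptions, not the real schema.
type Config struct {
	Prefix          string            // assumed command prefix, e.g. "!"
	BroadcasterOnly bool              // simplified stand-in for the role setting
	Scenes          []string          // scene names known to the plugin
	Aliases         map[string]string // alias -> scene name
}

// Result describes what the server should do with one chat message.
type Result struct {
	Handled bool   // false when the message is not a recognized command
	Reply   string // text for the bot to post, if any
	Scene   string // scene to queue for the plugin, if any
	Resume  bool   // true when "auto" asks to resume automation
}

// findScene matches a scene name case-insensitively.
func findScene(cfg Config, name string) (string, bool) {
	for _, s := range cfg.Scenes {
		if strings.EqualFold(s, name) {
			return s, true
		}
	}
	return "", false
}

// Evaluate maps one chat line to a Result.
func Evaluate(cfg Config, msg string, isBroadcaster bool) Result {
	msg = strings.TrimSpace(msg)
	if cfg.Prefix == "" || !strings.HasPrefix(msg, cfg.Prefix) {
		return Result{} // ordinary chat, not a command
	}
	if cfg.BroadcasterOnly && !isBroadcaster {
		return Result{} // silently ignore unauthorized users (assumed behavior)
	}
	fields := strings.Fields(strings.TrimPrefix(msg, cfg.Prefix))
	if len(fields) == 0 {
		return Result{}
	}
	cmd := strings.ToLower(fields[0])
	arg := strings.Join(fields[1:], " ")
	switch cmd {
	case "status":
		return Result{Handled: true, Reply: "chatbot is active"}
	case "auto":
		return Result{Handled: true, Resume: true, Reply: "automation resumed"}
	case "scene":
		if scene, ok := findScene(cfg, arg); ok {
			return Result{Handled: true, Scene: scene, Reply: "switching to " + scene}
		}
		return Result{Handled: true, Reply: "unknown scene: " + arg}
	}
	// Alias-based switching, e.g. "!brb".
	if scene, ok := cfg.Aliases[cmd]; ok {
		return Result{Handled: true, Scene: scene, Reply: "switching to " + scene}
	}
	return Result{}
}

func main() {
	cfg := Config{
		Prefix:  "!",
		Scenes:  []string{"Main Cam", "BRB"},
		Aliases: map[string]string{"brb": "BRB"},
	}
	for _, msg := range []string{"hello", "!status", "!scene main cam", "!brb", "!auto"} {
		fmt.Printf("%-18q -> %+v\n", msg, Evaluate(cfg, msg, false))
	}
}
```

Returning a plain value keeps the evaluator free of side effects, so the webhook handler can decide separately whether to send the reply and whether to queue the scene switch.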

Timeline

  • 2026-03-25: Phase 5b chat transport design approved. Architecture decided: server-side evaluation, EventSub Conduit for Twitch, polling for YouTube, in-memory command queue with 30s TTL.
  • 2026-03-25: Implementation plan written for transport layer. 12 tasks across 6 phases (A-F): core chat package, Twitch integration, OBS plugin integration, EventSub management, YouTube (deferred), deploy/E2E.
  • 2026-03-26: Chat wiring design approved. End-to-end flow defined: dock toggle triggers POST /chat/config, backend sets up EventSub, webhook evaluates messages, plugin polls pending commands.
  • 2026-03-26: Wiring implementation plan written. 10 tasks: store CRUD, bot-add helper, config store updates, handleChatConfig endpoint, teardown on unlink, differentiated replies, C++ toggle call, auto-announce, dock UI, deploy/E2E.

Current State

The design and implementation plans are approved and specified in detail. The transport plan defines 12 implementation tasks and the wiring plan defines 10 tasks. Key components to build:

  • Go internal/chat/ package: evaluator, command queue (30s TTL), config store, Twitch webhook signature verification, EventSub conduit/subscription manager, Helix chat sender, YouTube poller (deferred).
  • New API routes: POST /chat/twitch/webhook (HMAC-authed), GET /chat/pending (JWT), POST /chat/config (JWT), POST /chat/disable (JWT), POST /chat/announce (JWT, rate-limited 5/min).
  • New DB table: chat_subscriptions (migration 0024) with columns for user_id, provider, channel_id, eventsub_subscription_id, status. Unique index on (user_id, provider).
  • C++ plugin changes: command poll in relay_client.cpp (1.5s interval), POST /chat/config call on dock toggle with 3-second debounce, POST /chat/announce for auto-rule and auto-resume notifications.
  • Dock UI changes: status label below toggle (Connecting/Active/Error states), role dropdown replacing broadcasterOnly checkbox (Broadcaster only / Broadcaster + Mods / Everyone), 3-second toggle debounce.
  • 8 new environment variables needed on Advin for bot credentials, EventSub secret, and YouTube client.
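
As a rough sketch of the in-memory command queue listed above, the following Go code keeps per-user commands and drops expired ones when the plugin drains them. The type and method names are hypothetical, and the 50ms TTL in main only mirrors the style of the planned expiry test; production would use 30s.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Command is one queued scene switch (hypothetical shape).
type Command struct {
	Scene    string
	QueuedAt time.Time
}

// Queue holds pending commands per user in memory only.
type Queue struct {
	mu      sync.Mutex
	ttl     time.Duration
	pending map[string][]Command
}

// NewQueue creates a queue whose commands expire after ttl.
func NewQueue(ttl time.Duration) *Queue {
	return &Queue{ttl: ttl, pending: make(map[string][]Command)}
}

// Push appends a scene-switch command for the user.
func (q *Queue) Push(userID, scene string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.pending[userID] = append(q.pending[userID], Command{Scene: scene, QueuedAt: time.Now()})
}

// Drain returns the user's unexpired commands in FIFO order and clears
// the user's queue. Expired commands are dropped here rather than by a
// background sweeper.
func (q *Queue) Drain(userID string) []Command {
	q.mu.Lock()
	defer q.mu.Unlock()
	cutoff := time.Now().Add(-q.ttl)
	var fresh []Command
	for _, c := range q.pending[userID] {
		if c.QueuedAt.After(cutoff) {
			fresh = append(fresh, c)
		}
	}
	delete(q.pending, userID)
	return fresh
}

func main() {
	q := NewQueue(50 * time.Millisecond)
	q.Push("user-1", "BRB")
	time.Sleep(60 * time.Millisecond) // let the first command expire
	q.Push("user-1", "Main Cam")
	for _, c := range q.Drain("user-1") {
		fmt.Println("deliver:", c.Scene)
	}
	fmt.Println("left after drain:", len(q.Drain("user-1")))
}
```

Expiring on drain keeps the sketch free of background goroutines, but it also means a user whose plugin never polls keeps stale entries in memory until the next drain.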

Key Decisions

  • 2026-03-25: Server-side command evaluation instead of relaying chat to plugin — instant bot replies, plugin only executes scene switches, works when OBS is disconnected.
  • 2026-03-25: Single central server for chat (not regional) — chat latency tolerance is seconds; cross-Pacific adds only ~150ms, imperceptible in chat context.
  • 2026-03-25: Same containers (telemy-api, telemy-jobs) rather than a new service — webhook and poll are just new routes; current server is 90% idle.
  • 2026-03-25: Twitch EventSub Conduit (webhook transport) over legacy IRC — Twitch recommends EventSub for new development; no persistent connections to manage.
  • 2026-03-25: YouTube replies sent as the user (not a bot account) — YouTube has no “bot joins channel” model; the bot would need explicit invitation to each live chat.
  • 2026-03-25: In-memory config and command storage (not DB) — config is authoritative in the plugin and synced on connect; commands are ephemeral with 30s TTL.
  • 2026-03-25: YouTube transport deferred until Twitch works E2E — 10K unit/day quota makes it impractical without a Google quota extension.
  • 2026-03-26: Channel is always the authenticated user’s own Twitch channel (no free-text input) — prevents trolling by forcing channel identity from the linked OAuth account.
  • 2026-03-26: Toggle ON first time does heavy setup (add bot, create conduit, subscribe EventSub); subsequent toggles are lightweight flag flips — idempotent, repeated enable calls are no-ops.
  • 2026-03-26: Disable does not tear down EventSub subscription — sets ConfigStore enabled=false so webhook handler skips evaluation; subscription stays active for fast re-enable.
  • 2026-03-26: Chatbot is free for all users — no entitlement check required.
  • 2026-03-26: Scene switch announcements differentiate source: chat command (“requested by @michael”), auto-rule (“low bitrate detected”), auto-resume (“bitrate recovered”).
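
The enable/disable decisions above (heavy setup once, flag flips afterwards, no teardown on disable) could look roughly like this in Go. The Provisioner interface, ConfigStore methods, and countingProvisioner helper are hypothetical names used only for this sketch; the real setup would add the bot, create the conduit, and subscribe to EventSub.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Provisioner performs the one-time heavy setup for a user.
// Hypothetical interface standing in for the Twitch setup calls.
type Provisioner interface {
	Setup(userID string) error
}

type userState struct {
	provisioned bool // heavy setup has completed at least once
	enabled     bool // webhook handler should evaluate messages
}

// ConfigStore tracks per-user chat state in memory.
type ConfigStore struct {
	mu    sync.Mutex
	users map[string]*userState
	prov  Provisioner
}

func NewConfigStore(p Provisioner) *ConfigStore {
	return &ConfigStore{users: make(map[string]*userState), prov: p}
}

// Enable is idempotent: setup runs only on the first successful call.
// For brevity the lock is held during Setup; real code would likely
// avoid holding a global lock across network calls.
func (s *ConfigStore) Enable(userID string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	st := s.users[userID]
	if st == nil {
		st = &userState{}
		s.users[userID] = st
	}
	if !st.provisioned {
		if err := s.prov.Setup(userID); err != nil {
			return fmt.Errorf("chat setup for %s: %w", userID, err)
		}
		st.provisioned = true
	}
	st.enabled = true
	return nil
}

// Disable only flips the flag; the subscription stays in place.
func (s *ConfigStore) Disable(userID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if st := s.users[userID]; st != nil {
		st.enabled = false
	}
}

// ShouldEvaluate is what the webhook handler would check per message.
func (s *ConfigStore) ShouldEvaluate(userID string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	st := s.users[userID]
	return st != nil && st.enabled
}

// countingProvisioner is a demo stand-in that counts setup calls.
type countingProvisioner struct {
	calls int
	fail  bool
}

func (p *countingProvisioner) Setup(string) error {
	p.calls++
	if p.fail {
		return errors.New("setup failed")
	}
	return nil
}

func main() {
	p := &countingProvisioner{}
	s := NewConfigStore(p)
	_ = s.Enable("user-1")
	s.Disable("user-1")
	_ = s.Enable("user-1")
	fmt.Println("setup calls:", p.calls, "evaluating:", s.ShouldEvaluate("user-1"))
}
```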

Experiments & Results

Experiment | Status | Finding | Source
Twitch EventSub Conduit webhook transport | Design validated | 1 conduit with 2-3 shards handles thousands of users at launch; scales to 20,000 shards | transport-design
YouTube liveChatMessages polling | Quota-blocked | 10,000 units/day = ~2,000 polls = exhausted in ~2.8 hours at 5s intervals for a single stream; requires quota extension | transport-design
Advin server capacity for chat | Estimated sufficient | 4 CPU, 8GB RAM, 90% idle; handles 500+ chatbot users before needing Redis cache or dedicated DB | transport-design
Command evaluator Go port | Test-driven | 6 unit tests defined: no-prefix, status, scene switch, broadcaster-only, direct scene, auto-resume; tests written before implementation | transport-plan
Command queue TTL expiry | Test-driven | 50ms TTL test confirms expired commands are dropped on drain; 30s production TTL | transport-plan
EventSub HMAC-SHA256 verification | Test-driven | 3 test cases: valid signature, invalid signature, empty secret | transport-plan
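
For the EventSub HMAC-SHA256 verification row, a minimal Go sketch follows. Twitch documents the signed message as the Twitch-Eventsub-Message-Id header, the Twitch-Eventsub-Message-Timestamp header, and the raw request body concatenated in that order, with the result sent as "sha256=<hex>" in Twitch-Eventsub-Message-Signature. The function names and the choice to reject an empty secret outright are assumptions, not the project's actual code.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Sign computes the signature string Twitch would send for a message.
// It is used here only to build demo inputs.
func Sign(secret, messageID, timestamp string, body []byte) string {
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(messageID))
	mac.Write([]byte(timestamp))
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// VerifySignature reports whether signatureHeader matches the HMAC of
// messageID + timestamp + body under secret. An empty secret is always
// rejected so a misconfigured server cannot accept forged webhooks.
func VerifySignature(secret, messageID, timestamp string, body []byte, signatureHeader string) bool {
	if secret == "" {
		return false
	}
	expected := Sign(secret, messageID, timestamp, body)
	// hmac.Equal compares in constant time.
	return hmac.Equal([]byte(expected), []byte(signatureHeader))
}

func main() {
	secret := "example-secret" // placeholder value
	id, ts := "msg-123", "2026-03-25T12:00:00Z"
	body := []byte(`{"event":"demo"}`)

	sig := Sign(secret, id, ts, body)
	fmt.Println("valid signature:  ", VerifySignature(secret, id, ts, body, sig))
	fmt.Println("invalid signature:", VerifySignature(secret, id, ts, body, "sha256=deadbeef"))
	fmt.Println("empty secret:     ", VerifySignature("", id, ts, body, Sign("", id, ts, body)))
}
```

The handler must verify against the raw request bytes before any JSON decoding, since re-serialized JSON will generally not reproduce the signed payload.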

Gotchas & Known Issues

  • YouTube quota exhaustion: 10,000 units/day at 5 units/poll = 2,000 polls total. A single stream polled at 5s intervals exhausts the daily quota in under 3 hours (2,000 polls × 5s ≈ 2.8 hours). Must request a Google API quota extension before YouTube goes to production.
  • Twitch token refresh on bot-add: If the broadcaster’s token is expired when adding the bot to the channel, the system must refresh the token and retry. Failure path returns 502 with “try re-linking Twitch.”
  • Twitch token revocation: If a user revokes their Twitch token, the bot silently stops replying. The user must re-link their Twitch account to fix it. EventSub sends a revocation notification that is logged but not yet surfaced to the user.
  • Re-link to different Twitch account: Stale EventSub subscriptions for the old channel_id must be cleaned up when the user re-links to a different account.
  • POST /chat/config rate limit: 10/min per user (existing per-action limiter). POST /chat/announce is rate-limited to 5/min per user. Dock toggle has a 3-second UI debounce on top of server-side limits.
  • Auto-announce is fire-and-forget: POST /chat/announce from the C++ plugin has no retry. If the request fails, the chat announcement is silently dropped.
  • Naming migration: All new code uses telemy-* naming. No “aegis” in new code. Existing containers keep their names until the full rebrand.
  • OBS plugin HTTPS client pattern: The C++ command poll implementation depends on the existing HttpsClient patterns in relay_client.cpp and https_client.cpp. The implementer must read those files for the correct approach.
  • Twitch OAuth scopes: The broadcaster needs channel:bot scope (already included in existing OAuth flow). TelemyBot needs user:bot, user:write:chat, and user:read:chat (one-time setup).
  • handleChatPending signature: The transport plan uses auth.UserIDFromContext returning a single string, while the wiring plan uses a two-return form (userID, ok). The actual signature must match the codebase.
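
The YouTube quota gotcha above is plain arithmetic, and a small Go helper makes it easy to re-check if the quota or polling interval changes. QuotaPlan and its methods are hypothetical names; the inputs in main are the figures quoted in this document.

```go
package main

import (
	"fmt"
	"time"
)

// QuotaPlan models a fixed daily API budget spent by periodic polling.
// All fields are assumed to be positive.
type QuotaPlan struct {
	UnitsPerDay  int           // daily quota
	UnitsPerPoll int           // cost of one poll request
	Interval     time.Duration // time between polls for one stream
}

// PollsPerDay is how many polls the daily budget pays for.
func (p QuotaPlan) PollsPerDay() int {
	if p.UnitsPerPoll <= 0 {
		return 0
	}
	return p.UnitsPerDay / p.UnitsPerPoll
}

// TimeToExhaust is how long one stream can poll before the budget is gone.
func (p QuotaPlan) TimeToExhaust() time.Duration {
	return time.Duration(p.PollsPerDay()) * p.Interval
}

// MinSustainableInterval is the fastest polling interval that still
// lasts a full 24 hours for a single stream.
func (p QuotaPlan) MinSustainableInterval() time.Duration {
	polls := p.PollsPerDay()
	if polls == 0 {
		return 0
	}
	return 24 * time.Hour / time.Duration(polls)
}

func main() {
	plan := QuotaPlan{UnitsPerDay: 10000, UnitsPerPoll: 5, Interval: 5 * time.Second}
	fmt.Println("polls per day:          ", plan.PollsPerDay())
	fmt.Println("time to exhaust:        ", plan.TimeToExhaust())
	fmt.Println("min sustainable interval:", plan.MinSustainableInterval())
}
```

With these figures, a single stream would have to slow to roughly one poll every 43 seconds to last a full day, which is why a quota extension is a prerequisite rather than an optimization.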

Open Questions

  • When will the Google API quota extension be requested for YouTube live chat polling?
  • Should EventSub revocation notifications be surfaced to the user in the dock UI, or just logged?
  • What is the TelemyBot Twitch user ID? (Referenced as needing retrieval from Twitch API during deploy.)
  • Should the POST /chat/config endpoint support YouTube provider selection, or is Twitch-only sufficient for initial launch?
  • When will the long-polling upgrade for /chat/pending be considered? (Scaling trigger: 2000+ req/s on command poll.)
  • Should the command queue have a max depth per user to prevent memory issues if the plugin is disconnected for extended periods?

Sources

  • 2026-03-25-phase5b-chat-transport-design.md
  • 2026-03-25-phase5b-chat-transport-plan.md
  • 2026-03-26-phase5b-chat-wiring-design.md
  • 2026-03-26-phase5b-chat-wiring-plan.md