Cloudflare Publish Audit

Last reviewed: 2026-04-18 (LAPTOP-KAI)

Investigation of how /signal-crawl pushes newsletters to debrief.pentz.io. Context: before rotating the Cloudflare API token, confirm which publish path is actually in use, whether auxiliary paths still work, and whether KV state matches the Postgres digests table. Account and namespace identifiers are redacted to trailing 4 characters below; full values live in ~/signal-feed.env on the laptop and in /opt/signal-feed/.env on Advin.

Publish path actually in use

Direct Cloudflare REST API via curl. Endpoint pattern:

PUT https://api.cloudflare.com/client/v4/accounts/{acct:…ab47}/storage/kv/namespaces/{ns:…c095}/values/{key}
Authorization: Bearer $CF_API_TOKEN

Three keys are written per run: edition:<ISO_TS>, editions (JSON array, newest-first), latest (plain text pointer). The wrangler alternative documented in the skill is not used. On LAPTOP-KAI there is no ~/.wrangler/, no signal-feed/worker/ subdir in the repo, and no local wrangler install. CF_API_TOKEN is the only credential in play and is sourced from ~/signal-feed.env via ~/.bashrc. CLOUDFLARE_API_TOKEN (wrangler’s preferred env var) is unset and irrelevant on this host.
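The three-PUT sequence can be sketched in Python with only the standard library. This is a hedged sketch, not the skill's actual code: the helper names and the way the editions array is merged are assumptions; account, namespace, and token come from the env files noted above.

```python
import json
import urllib.parse
import urllib.request

# Same endpoint pattern as the curl publish path.
API = ("https://api.cloudflare.com/client/v4/accounts/{acct}"
       "/storage/kv/namespaces/{ns}/values/{key}")

def kv_payloads(ts, html, prior_editions):
    """Build the three (key, value) writes for one run: the edition body,
    the newest-first editions index, and the plain-text latest pointer."""
    editions = [ts] + [e for e in prior_editions if e != ts]
    return [
        (f"edition:{ts}", html),
        ("editions", json.dumps(editions)),
        ("latest", ts),
    ]

def put_kv(acct, ns, token, key, value):
    """Equivalent of the curl PUT; the KV key must be URL-encoded
    because edition timestamps contain ':' and '+'."""
    req = urllib.request.Request(
        API.format(acct=acct, ns=ns, key=urllib.parse.quote(key, safe="")),
        data=value.encode(),
        method="PUT",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

kv_payloads is pure, so the write set for a run can be inspected before anything touches the API.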

publish.py is dead code

src/signal_feed/publish.py is 45 lines. It targets the Cloudflare Pages Direct Upload endpoint (/accounts/{acct}/pages/projects/{proj}/deployments), not Workers KV. It exports one function, publish_to_cloudflare(), which has zero callers in the repo: generate.py does not import it, and the sf-generate console script (pyproject.toml → signal_feed.generate:main) does not reach it. The file contains no references to CF_KV_NAMESPACE_ID despite earlier speculation; that env var is unused anywhere in the repo.

Implication: the CF_PAGES_PROJECT=signal-feed Pages project from the v0.1 design is decoupled from the current publish flow. Safe to delete publish.py once Advin’s deployed copy is confirmed identical, or keep it as a documented fallback with a header comment noting it is unused.

KV state vs digests table

Snapshot 2026-04-18 03:25 UTC.

KV keys in namespace {ns:…c095}:

  • edition:2026-04-15T00:00:00+00:00
  • edition:2026-04-16T03:59:00+00:00
  • edition:2026-04-16T12:00:00+00:00
  • edition:2026-04-18T01:46:37+00:00
  • editions, latest

Postgres digests table (Advin, glb-postgres container):

edition_date  edition_ts              html_bytes  posts
2026-04-18    2026-04-18 01:46:37+00       28528     26
2026-04-16    2026-04-16 12:00:00+00       26546      5
2026-04-15    2026-04-16 03:59:00+00       25542     27
2026-04-15    2026-04-15 00:00:00+00       26672     27

Consistency: KV edition keys and digests.edition_ts match 1:1 across all four rows. latest correctly resolves to 2026-04-18T01:46:37+00:00.
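The 1:1 check can be mechanized. A minimal sketch, assuming everything is UTC (the KV keys render the offset as +00:00, the Postgres column as +00), so offsets can be dropped during normalization; the function name is mine:

```python
def edition_consistency(kv_keys, db_timestamps):
    """Compare edition timestamps from KV key names against digests.edition_ts.

    Returns (only_in_kv, only_in_db). Normalization assumes all timestamps
    are UTC: the offset is dropped and the date/time separator unified."""
    def norm(ts):
        return ts.replace(" ", "T")[:19]  # keep YYYY-MM-DDTHH:MM:SS

    kv = {norm(k.removeprefix("edition:"))
          for k in kv_keys if k.startswith("edition:")}
    db = {norm(ts) for ts in db_timestamps}
    return kv - db, db - kv
```

Running it over the snapshot above yields two empty sets, which is the 1:1 match.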

Drift: editions only contains two entries, ["2026-04-18T01:46:37+00:00", "2026-04-16T12:00:00+00:00"]. The two earlier editions are orphaned from the dropdown on debrief.pentz.io, though still retrievable by direct KV key. Root cause is likely an early run where the “read current editions, append, write back” step ran before all keys were tracked. Repair: list edition:* keys from the KV API, sort descending, PUT the result back to the editions key.
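The rebuild step can be sketched as below. Key listing would come from the KV keys endpoint (GET .../storage/kv/namespaces/{ns}/keys, optionally with a prefix parameter); descending lexicographic sort is sufficient here because all four keys share the same zero-padded ISO 8601 format with identical offsets:

```python
import json

def rebuild_editions(key_names):
    """Rebuild the editions JSON body from a listing of KV key names:
    filter to edition:* keys, strip the prefix, sort newest-first."""
    ts = sorted(
        (k.removeprefix("edition:") for k in key_names if k.startswith("edition:")),
        reverse=True,  # lexicographic desc == chronological desc for this format
    )
    return json.dumps(ts)
```

The returned body would then be PUT back to the editions key via the same values endpoint the publish path already uses.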

Access and reachability

debrief.pentz.io is behind Cloudflare Access (Zero Trust, team allysoftllc). Anonymous curl -I returns 302 to allysoftllc.cloudflareaccess.com. No anonymous content verification is possible from outside. The latest KV value being current is the best external signal that a given run succeeded.

Token rotation impact on LAPTOP-KAI

Only the curl publish path inside /signal-crawl consumes CF_API_TOKEN on this host. Rotating the token requires updating ~/signal-feed.env on the laptop and /opt/signal-feed/.env on Advin. No wrangler-cached OAuth token exists on this machine; no worker-local config; no other .env file carries CF credentials (only ~/src/signal-feed/.env.example).
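The rotation edit itself is mechanical; a minimal sketch (function name and dotenv layout assumed) that could be applied to the text of both ~/signal-feed.env and /opt/signal-feed/.env:

```python
def rotate_cf_token(env_text, new_token):
    """Return dotenv-style text with the CF_API_TOKEN value replaced,
    leaving every other line untouched."""
    out = [f"CF_API_TOKEN={new_token}" if line.startswith("CF_API_TOKEN=") else line
           for line in env_text.splitlines()]
    return "\n".join(out) + "\n"
```

Keeping the function pure on text (read file, rewrite, write back) makes the edit easy to diff before committing it to either host.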

Open items

  • Rebuild the editions KV value from the current KV key listing so the site dropdown shows all four editions.
  • Decide whether publish.py stays as a documented fallback or gets removed. If it stays, add a top-of-file comment noting it is unused in the current pipeline.
  • Install wrangler plus CLOUDFLARE_API_TOKEN on LAPTOP-KAI only if wrangler is wanted as a sanctioned backup path. Leaving it uninstalled avoids accidental OAuth prompts under cron.