Privacy Model

Summary

ToneForge’s privacy model is built around a single architectural guarantee: “Your writing passes through. Your profile is yours. We can’t read it.” The guarantee is enforced through data classification rules and client-side encryption rather than policy. Raw text submitted for analysis never persists on the server. During Track 2 analysis, the text arrives over TLS and is forwarded to the LLM provider; once the assessment is returned, the text is discarded from memory. Nothing is logged, written to a database, or saved to a temp file.
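The pass-through flow described above can be sketched roughly as follows. This is a minimal illustration, not ToneForge’s actual server code: the function name analyze_pass_through, the forward_to_llm callable, and the metadata field names are all hypothetical. Python cannot guarantee that memory is wiped, so "discarded" here only means that no reference to the text outlives the call.

```python
import time
from typing import Callable


def analyze_pass_through(text: str, forward_to_llm: Callable[[str], str]) -> tuple[str, dict]:
    """Forward text to the LLM and return the assessment plus content-free metadata.

    Hypothetical sketch: the text is never written to a log, database, or file.
    """
    started = time.monotonic()
    assessment = forward_to_llm(text)  # the only place the raw text is sent
    latency_ms = round((time.monotonic() - started) * 1000)

    # Only derived, content-free values are kept (assumed field names).
    metadata = {"word_count": len(text.split()), "latency_ms": latency_ms}

    # No reference to `text` is stored anywhere; it goes out of scope on return.
    return assessment, metadata


def fake_llm(text: str) -> str:
    """Stand-in for the real provider call, used only for illustration."""
    return "tone: neutral"


assessment, metadata = analyze_pass_through("A short private draft.", fake_llm)
```

The point of the design is that the handler has no side effects involving the text, so there is nothing to retain or audit beyond the metadata record.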

Profile data (the substrate) is encrypted client-side before upload, using a key derived from a user-supplied passphrase with Argon2id. The server stores an opaque encrypted blob plus plaintext metadata. The encryption key never leaves the client. Other devices download the blob and decrypt it locally. Under this design, the server operator cannot read any user’s profile.
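The key-derivation step can be illustrated with the Python standard library. Note the substitution: Argon2id is not available in the standard library, so this sketch uses scrypt (also a memory-hard KDF) purely as a stand-in. A real client would call an Argon2id implementation and pass the derived key to an authenticated cipher. The function name derive_key and the cost parameters are illustrative assumptions, not values from the ToneForge plan.

```python
import hashlib
import os


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from a passphrase.

    Stand-in only: scrypt replaces Argon2id here because it ships with Python.
    The cost parameters (n, r, p) are illustrative, not recommended settings.
    """
    return hashlib.scrypt(
        passphrase.encode("utf-8"),
        salt=salt,
        n=2**14,
        r=8,
        p=1,
        dklen=32,
    )


# The salt is random but not secret; it can be stored next to the encrypted blob
# so that a second device can re-derive the same key from the same passphrase.
salt = os.urandom(16)
key_on_device_a = derive_key("correct horse battery staple", salt)
key_on_device_b = derive_key("correct horse battery staple", salt)
```

Because only the salt and the ciphertext are uploaded, the server never holds anything from which the key can be recovered without the passphrase.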

Track 1 (local-only mode) eliminates cloud exposure entirely. It runs with no network calls, is fully open source and auditable, and can operate airgapped. Track 2 requires a cloud connection for LLM analysis but applies three mitigations: TLS in transit; either a pass-through server architecture or direct-to-LLM calls with the user’s own API key; and reliance on Anthropic’s 30-day API retention policy, under which data is kept for abuse monitoring only and is not used for training. ToneForge is the only tool in this category where the operator cannot read the user’s profile and where a fully local mode with zero cloud dependency exists.
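One way to check the "no network calls" property of Track 1 in a test suite is to block socket creation while the local code runs. The sketch below is a coarse guard, not a sandbox: it will not catch code that already holds a reference to the original socket class or that spawns a subprocess. The names no_network, NetworkBlockedError, and analyze_locally are hypothetical.

```python
import socket
from contextlib import contextmanager


class NetworkBlockedError(RuntimeError):
    """Raised when code tries to open a socket inside no_network()."""


@contextmanager
def no_network():
    """Temporarily replace socket.socket so any new connection attempt fails."""
    original = socket.socket

    def _blocked(*args, **kwargs):
        raise NetworkBlockedError("network access attempted in local-only mode")

    socket.socket = _blocked
    try:
        yield
    finally:
        socket.socket = original  # always restore the real class


def analyze_locally(text: str) -> dict:
    """Hypothetical stand-in for a Track 1 analysis step; does no I/O."""
    return {"word_count": len(text.split())}


with no_network():
    local_result = analyze_locally("runs without any network access")
```

Wrapping the Track 1 test suite in a guard like this turns the local-only claim into something a reviewer can run rather than take on trust.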

Timeline

  • 2026-04-13: Privacy model formalized in v2 research plan. Data classification table defined. Client-side encryption architecture specified. Metadata logging boundaries documented.

Current State

Four data categories are defined with explicit handling rules:

Category | Location | Operator-readable
Raw text | In-memory only during analysis | Briefly, during forwarding
Analysis output | Client-side only | No
Substrate | Encrypted blob on sync server | No
Metadata | Plaintext on server | Yes

Metadata logged: API key hash, timestamps, word count, source type, track used, latency. Not logged: text content, analysis results, substrate contents, injection targets. Track 1 is implementable now. The Track 2 privacy architecture is designed, but the choice between a server proxy and direct-to-LLM is unresolved.
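The logging boundary above lends itself to an allowlist: a log record is built only from the permitted fields, so anything else is dropped by construction. The following is a minimal sketch under stated assumptions. The field names, the build_log_record function, and the use of SHA-256 for the API key hash are all illustrative; the source specifies only that a hash of the API key is logged, not which hash.

```python
import hashlib

# Assumed field names for the metadata that may be logged.
ALLOWED_FIELDS = ("timestamp", "word_count", "source_type", "track", "latency_ms")


def build_log_record(event: dict) -> dict:
    """Copy only allowlisted fields; everything else is silently dropped."""
    record = {name: event[name] for name in ALLOWED_FIELDS if name in event}
    if "api_key" in event:
        # Log a hash, never the key itself (SHA-256 is an assumption here).
        digest = hashlib.sha256(event["api_key"].encode("utf-8")).hexdigest()
        record["api_key_hash"] = digest
    return record


event = {
    "api_key": "example-key-123",
    "timestamp": "2026-04-13T10:00:00Z",
    "word_count": 412,
    "source_type": "email",
    "track": 2,
    "latency_ms": 950,
    "text": "raw draft that must never be logged",
    "analysis": "assessment that must never be logged",
}
record = build_log_record(event)
```

An allowlist fails closed: a new field added to the event later stays out of the logs until someone deliberately adds it to ALLOWED_FIELDS.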

Key Decisions

  • 2026-04-13: Raw text never persists on the server. This is enforced as an architectural rule, not a policy, which simplifies compliance and audits and removes data retention risk.
  • 2026-04-13: Client-side encryption with Argon2id for substrate sync. The server stores an opaque blob, so the operator cannot read profiles even with database access.
  • 2026-04-13: Track 1 local-only mode included as a first-class path. It enables airgapped use and open source auditability, and differentiates ToneForge from all competitors.

Open Questions

  • Direct-to-LLM vs. server proxy for Track 2: the proxy keeps API keys server-side but sees the text in plaintext in transit; direct-to-LLM exposes the user’s API key but removes the server from the data path.
  • Substrate compilation location: client-side exposes the algorithm, server-side requires trusting the operator.
  • Opt-in diagnostic mode: needed for troubleshooting but creates a logging exception that weakens the privacy story.
  • Regulatory posture: GDPR compliance, data processing agreements, and the jurisdiction of the sync server have not yet been evaluated.

Sources