How your data is handled
Frame Check is built around a single principle: the corpus is privacy-preserving because of what it never receives, not because of what we promise to do with what it does receive. Privacy by construction instead of by policy. This page enumerates the specific consequences of that principle and points at the versioned engineering document that enforces it.
What is never stored
None of the following is written to persistent storage by any part of Frame Check, regardless of whether you are running a single-text check, running a two-text comparison, or using Frame Mirror:
- The text you paste. Not the full text, not the first N characters, not a hash of it, not a fingerprint that could partially reconstruct it.
- The text of any reference or source material you paste alongside your text.
- Your topic string when using the two-text compare.
- Your IP address. It is used in-memory by the rate limiter and then discarded. It never reaches the events log or any corpus export.
- Session cookies, device fingerprints, User-Agent strings, or any browser metadata.
- Email addresses. Frame Check does not ask for one.
- Claim sentences, claim context windows, claim section headings, or any fragment of claim text.
- Source URLs visited during verification. Only the categorical name of the provider (SEC EDGAR, FRED, World Bank, Wikipedia) is retained, never the specific URL.
- Search query strings issued against any external provider during verification.
- Raw model response text from any LLM call. Only structural derivatives.
What is stored
Each completed analysis writes a single aggregate event to an internal SQLite database. That event contains coarse counts (claim counts, source provider counts) and categorical codes (coverage flags, voice class, document type class). No field in that event can be linked back to you or to the specific text you submitted. The full field list is enumerated in the engineering document linked at the bottom of this page.
PII and credential intake awareness
Frame Check is a structural analyzer for argumentative text. It is not the right tool for personal records, secrets, or credentials, and it has no use for that material even structurally. To make this visible at the moment it matters, the form scans pasted text for patterns that almost always indicate sensitive content: email addresses, US Social Security number formats, Luhn-validated payment card numbers, E.164 phone numbers, and prefixed API credentials such as sk-..., ghp_..., AIza..., and AKIA.... When any of these are present:
- The result page renders a yellow notice naming the category of pattern detected (e.g. "email addresses"), so you know we noticed and so a downstream agent reading the JSON response can branch on it.
- The 24-hour V4.2 reasoning cache is bypassed for that request. PII-bearing reasoning never persists to the on-disk cache file. The same input resubmitted later does not benefit from caching, by design.
- The detected substrings are never returned in the response, never logged, and never written to the aggregate event. Only the count per category is exposed. A text containing one email address yields {"email": 1}; the address itself is discarded.
- Patterns that look like credentials are detected even when they appear inside otherwise-legitimate text (e.g. a leaked key in a code block), and the same cache-bypass and category-only-reporting rules apply.
The notice is awareness, not a refusal. If you have a real reason to paste text containing such patterns, you still get the structural analysis. The detector exists to prevent silent retention, not to filter input.
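A minimal sketch of category-only detection, to make the "counts, never substrings" rule concrete. The regular expressions, the function names, and every category key other than "email" are assumptions made for illustration; the production detector's actual patterns are not given on this page.

```python
import re

# Hypothetical, simplified patterns; the real scanner's rules may differ.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\+[1-9]\d{7,14}\b"),
    "api_credential": re.compile(
        r"\b(?:sk-[A-Za-z0-9_-]{16,}|ghp_[A-Za-z0-9]{36}"
        r"|AIza[0-9A-Za-z_-]{35}|AKIA[0-9A-Z]{16})\b"
    ),
}
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum over a string of decimal digits."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan(text: str) -> dict[str, int]:
    """Return counts per category; matched substrings are never kept."""
    counts: dict[str, int] = {}
    for category, pattern in PATTERNS.items():
        n = sum(1 for _ in pattern.finditer(text))
        if n:
            counts[category] = n
    cards = sum(
        1
        for m in CARD_CANDIDATE.finditer(text)
        if luhn_valid(re.sub(r"\D", "", m.group()))
    )
    if cards:
        counts["card"] = cards
    return counts
```

In this sketch a caller would treat a non-empty result as the signal to render the notice and skip the reasoning cache; the result carries nothing that could reconstruct the matched text.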
Saved analyses and saved comparisons
When you click "Save" on a results page or a compare page, Frame Check writes the analysis output to disk at a URL of the form /saved/{hash} or /compare/saved/{hash}. The saved file contains:
- The structural measurements computed from your input (coverage map, voice classification, temporal orientation, epistemic basis, claim counts).
- The V4.2 LLM-judge frame panel (per-frame verdicts and reasoning) if V4.2 ran on the analysis.
- For single-text saves, an annotated rendering of your text with claim numbers highlighted. This IS your text plus the markup overlay; the recipient of your share URL sees the highlighted text the original author saw. Source material and topic strings (when provided) are NOT separately persisted: only the text you analyzed appears in the annotated render.
- For two-text compare saves, the per-text structural measurements and the cross-text comparison output. Text section labels are kept; the raw text bodies are NOT persisted on disk (only the truncated text used to render each side's analysis card survives).
What is NOT in the saved file: your IP address, the topic string for compare, source-material text (single-doc), or any field that could be linked back to you. The round-3 audit (2026-05-01) closed the gaps where prior schema versions retained these.
Every saved entry auto-deletes after 30 days. The URL stops working after that, and the file is removed from disk. A background sweep runs hourly so cleanup happens regardless of traffic.
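One pass of such a cleanup could look like the sketch below. It assumes saved analyses are individual files in one directory and that file modification time marks when the save was written; both are illustrative assumptions, as are the names sweep_expired and saved_dir. The hourly scheduling itself is left out.

```python
import time
from pathlib import Path
from typing import Optional

RETENTION_SECONDS = 30 * 24 * 60 * 60  # the 30-day window described above


def sweep_expired(saved_dir: Path, now: Optional[float] = None) -> int:
    """Delete files older than the retention window; return how many were removed."""
    now = time.time() if now is None else now
    removed = 0
    for path in saved_dir.iterdir():
        # Assumption: mtime is the save time, since saves are written once.
        if path.is_file() and now - path.stat().st_mtime > RETENTION_SECONDS:
            path.unlink()
            removed += 1
    return removed
```

Running the pass on a timer rather than on request is what makes cleanup independent of traffic: an entry nobody visits is still removed.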
The save URL is derived from sha256(your inputs)[:16], that is, the hash truncated to its first 16 characters. It is unguessable by random brute force (~10^19 possibilities) but is deterministic: anyone who knows the exact text you analyzed can compute the same URL. Treat the URL as confidential when the underlying input is confidential. The endpoint is not authenticated; anyone you share the URL with can read the saved analysis until it expires.
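The trade-off between unguessable and deterministic can be seen in a short sketch. It assumes the 16 characters are taken from the hexadecimal digest and that the input is hashed as UTF-8 text; how Frame Check actually serializes "your inputs" before hashing is not stated here, and save_id is a hypothetical name.

```python
import hashlib


def save_id(text: str) -> str:
    """First 16 hex characters of the SHA-256 digest of the input text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


# 16 hex characters carry 64 bits, so the identifier space is 16**16 == 2**64,
# roughly 1.8e19 values.
KEYSPACE = 16 ** 16
```

The same text always yields the same identifier, which is why someone who already holds the exact input can reach the saved page without ever being given the link.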
Frame Mirror
Frame Mirror is opt-in. If you enable it, the browser you are using generates a random identifier and stores it locally. That identifier is the only thing that links your records to your Mirror view. We do not ask for an email and do not try to correlate the identifier with anything outside your browser.
Mirror stores three kinds of data, all keyed only to the random identifier above. The full field list is enumerated in the engineering document linked at the bottom of this page; the summary follows.
- Session summaries. One row per analysis you run while Mirror is enabled. Structural derivatives only: which frames were detected, which analytical perspectives were missing, the voice classification, the sentence count, the temporal orientation. The text is not stored, the topic string is not stored, and no field can be linked back to the specific text you submitted.
- Decision and outcome text. If you use the "track your decision" panel on a result page, or the return-visit nudge that surfaces unresolved decisions on later /check visits, the prose you type into the decision and outcome boxes is stored verbatim. Each field is capped at 500 characters. Frame Check does not parse or embed this prose and does not send it to any third party. It is shown back to you on the Mirror page and on the saved analysis the decision was tied to, and it is read by our review process described below (as aggregate counts by default, and as raw excerpts when we enable that mode). You can delete a single decision row from the Mirror page, or delete every Mirror row at once via the "delete all" button on the same page.
- Falsification flags. If you click "Disagree with this frame?" on a verdict and submit a reason, three user-supplied fields are stored along with timestamp and session linkage: the FVS frame id, the disagreement category (one of detection, reasoning, frame_choice), and the free-form reason text you typed (capped at 500 characters). Aggregate flags drive methodology evolution: when many users flag the same FVS as a misfire, our review process surfaces the pattern so the detection rule can be revisited. You can delete a single flag row from the Mirror page, or delete every Mirror row at once via the "delete all" button on the same page; a deleted row no longer contributes to our aggregate count.
What is NOT in any of the three streams: your IP address, the text, the topic string, your User-Agent, your email (none is collected), or any field that ties the random identifier to anything outside your browser.
Two consequences are worth stating explicitly. First, the prose you type into the decision, outcome, and falsification boxes is read by us. The default mode of our review process reports counts only (flags-per-FVS, outcome rate, unresolved decisions in a window); a second mode, which we enable on our side, dumps recent reason excerpts so we can see what users are actually saying. Reason excerpts are tagged with an eight-character prefix of the random identifier, never the full identifier. Second, the random identifier lives in your browser cookies; clearing cookies severs your link to your Mirror records but does not delete the underlying rows, which remain in the database under the cleared identifier and become unreachable from any surface. To delete the rows themselves, click "delete all" on the Mirror page before clearing your cookies.
You can export everything Frame Mirror has recorded (sessions, decisions, falsifications), or delete it all across every table, from the Mirror page.
Third-party services
Running an analysis may cause Frame Check to call external services on your behalf. Each call is bounded to the structural work that service performs and is invoked only when the relevant feature is active.
Source-network verification providers
SEC EDGAR, FRED, World Bank, REST Countries, Alpha Vantage, Wolfram Alpha, Wikipedia, CoinGecko, and Brave Search are queried for numeric claims in your text. The default query contains the structured claim (entity, metric, value) derived from the text, not the text itself. No account or API key of yours is involved.
Two implementation details that protect against accidental text exposure:
- Brave Search runs a two-stage entity gate before any outbound query. Bare numeric runs without a real entity subject (which include phone numbers, SSN-shaped sequences, and raw card numbers when they happen to be extracted as numeric claims) are blocked at the gate and never become a search query. Verified live in the round-3 audit (2026-05-01).
- When the structured decomposition for a claim is weak (subject extracted but metric missing), Brave Search may receive the cleaned claim sentence (truncated to 150 characters) instead of the structured triple. The entity gate above still applies: a sentence whose subject is not a real entity does not produce an outbound query.
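The two rules above can be sketched as a single query builder. This is a simplified stand-in, not the production gate: the Claim type, the has_real_entity check (one alphabetic token of two or more letters), and build_query are all hypothetical, and the real gate has two stages whose details are not given on this page.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SENTENCE_CHARS = 150  # truncation limit stated above


@dataclass
class Claim:
    sentence: str
    entity: Optional[str]
    metric: Optional[str]
    value: Optional[str]


def has_real_entity(entity: Optional[str]) -> bool:
    """Stand-in gate: require at least one alphabetic token of 2+ letters."""
    if not entity:
        return False
    return any(tok.isalpha() and len(tok) >= 2 for tok in entity.split())


def build_query(claim: Claim) -> Optional[str]:
    """Return the outbound query string, or None when the gate blocks it."""
    if not has_real_entity(claim.entity):
        return None  # bare numeric runs never become a search query
    if claim.metric and claim.value:
        return f"{claim.entity} {claim.metric} {claim.value}"
    # Weak decomposition: fall back to the truncated claim sentence.
    return claim.sentence.strip()[:MAX_SENTENCE_CHARS]
```

Checking the entity before choosing between the triple and the sentence fallback is the point of the ordering: the fallback path cannot bypass the gate.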
LLM providers (Google Gemini, xAI/Grok)
Frame Check has four LLM call paths. Each is configured independently on the deploy and each transmits a different slice of your input. We list the paths explicitly so a privacy-conscious reader can see which features cause which transmissions:
- V4.2 framing engine (xAI/Grok). Per-frame LLM-judge analysis on every paste. Sends the full text up to the configured 20,000-character cap, in one prompt. Rate-limited and cost-capped per IP.
- AI-assisted interpretation (xAI/Grok). The narrative section under the result page. Sends the full text for inputs at or below 3,000 characters; for longer inputs it sends an excerpt (first 2,000 plus last 1,000 characters) with a labeled middle-section omission marker. Optional, fires only when the deploy has the key configured.
- Reframe (xAI/Grok). When the user clicks "See it reframed" on a detected frame, sends the full text with the FVS counter-frame instructions. Rate-limited per IP per day (default: 3 reframes per IP).
- Compare-framing (xAI/Grok). Optional narrative on the compare page that contrasts how two texts frame the same topic. Sends the first 800 characters of each text plus the structural measurements both texts already produced.
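The excerpt rule on the AI-assisted interpretation path can be sketched as follows. The thresholds come from the list above; the marker wording and the function name interpretation_input are assumptions.

```python
FULL_TEXT_LIMIT = 3000  # inputs at or below this length are sent whole
HEAD_CHARS = 2000
TAIL_CHARS = 1000
# Hypothetical marker text; the page only says the omission is labeled.
OMISSION_MARKER = "\n[... middle section omitted ...]\n"


def interpretation_input(text: str) -> str:
    """Return the slice of the text that this path would transmit."""
    if len(text) <= FULL_TEXT_LIMIT:
        return text
    return text[:HEAD_CHARS] + OMISSION_MARKER + text[-TAIL_CHARS:]
```

Under these numbers the middle of a long input is not transmitted on this path, although the V4.2 path listed first still sends the full text up to its own cap.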
On every path: PII detected by our intake scanner (email, SSN, phone, card, API credential) is replaced with category placeholders BEFORE the prompt is built, so the substring never leaves your browser session in raw form. The provider sees the redacted text, not the original. Provider retention is governed by their own terms.
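A minimal sketch of placeholder substitution before prompt construction. It covers only two categories with simplified patterns; the placeholder format ([EMAIL], [SSN]) and the function name redact are assumptions, not the documented output of the intake scanner.

```python
import re

# Hypothetical, simplified patterns and placeholder labels.
REDACTIONS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
]


def redact(text: str) -> str:
    """Replace each detected substring with its category placeholder."""
    for label, pattern in REDACTIONS:
        text = pattern.sub(f"[{label}]", text)
    return text


prompt_text = redact("Mail alice@example.com, SSN 123-45-6789.")
# prompt_text == "Mail [EMAIL], SSN [SSN]."
```

Because substitution happens before the prompt string exists, nothing downstream of redact (prompt assembly, the provider call, any retry) has access to the original substring.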
Users who want the structural analysis without any LLM-side processing can run Frame Check on a deploy without LLM keys configured (or in MCP form locally). In that mode no text you paste leaves your machine.
Cloudflare
Production requests transit Cloudflare as the CDN and WAF in front of frame.clarethium.com. Cloudflare logs source IP, User-Agent, request path, and response codes per their privacy policy and retention windows. Frame Check sees Cloudflare's logs only during incident response; their content is governed by Cloudflare's terms, not by the privacy-by-construction posture this page describes.
Cloudflare Turnstile is also used for bot detection on form submission. Turnstile's Privacy Policy governs that component.
Corpus exports
Frame Check publishes daily NDJSON and Parquet exports of the Observatory and verification corpus at /corpus/ on this domain (currently paused; production stopped 2026-04-23 pending operator authorization to resume). The Observatory's Tier B topic-cycle stream is paused under Option D (2026-04-22); only the Tier A user-analysis stream collects when the service is running. User analyses are recorded only at the aggregate tier described above, and that tier is the one whose shape is enforced by the privacy principle.
Requests and deletion
Because Frame Check stores no field that can be linked to an individual, there is nothing to delete in the main corpus on request. Frame Mirror, the one surface where per-browser data exists, has an explicit delete button on the Mirror page that clears the browser identifier and every record associated with it. Saved analyses are addressable by their hash URL; if you want a save taken down before its 30-day expiry, send the URL to [email protected].
Technical reference
What this page commits to is written up in more detail here: