About Frame Check

Frame Check shows you the frame any text takes: which perspectives it leans on, where its coverage is thin, and how it positions you as the reader. It also checks numerical claims against authoritative sources where it can.

It came out of a pile of small experiments on how AI frames things. The methodology and the frame vocabulary behind it are written up openly, and the code is open source, but this is a working tool, not a research product. Framing analysis is the point; verification is a floor, bounded to what free data sources can reliably support.

The problem

Better AI models produce more persuasive framing, not less biased framing. More fluent. More confident. Harder to see through. A 2,000-word analysis that mentions risk once and growth fourteen times feels comprehensive. The ratio is invisible to casual reading. Every claim is presented with identical confidence, whether it comes from training data or was generated on the spot.

How it works

Structural framing analysis (the core)

Five deterministic measurements compute the structural frame: analytical coverage across five categories with density per 1,000 words, temporal orientation, voice classification across five registers, epistemic basis, and claim-density structure. A synthesis portrait describes what the text does to the reader; a separate rule layer suggests named frame patterns from the library when their identification cues fire.
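
As a rough illustration of the coverage measurement, the sketch below counts marker words per category and normalizes the count to a density per 1,000 words. The category names and marker lists are invented for the example; the real five categories and their vocabularies are defined in the methodology, and the actual detector is likely more involved than plain word matching.

```python
import re
from collections import Counter

# Hypothetical categories and marker words, for illustration only.
MARKERS = {
    "risk": {"risk", "risks", "downside", "threat"},
    "growth": {"growth", "expansion", "upside"},
}


def coverage_density(text: str) -> dict[str, float]:
    """Return marker hits per 1,000 words for each category."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {category: 0.0 for category in MARKERS}
    counts = Counter(words)
    return {
        category: 1000 * sum(counts[m] for m in markers) / len(words)
        for category, markers in MARKERS.items()
    }
```

Under these assumptions, a 1,000-word text that says "growth" fourteen times and "risk" once scores 14.0 and 1.0, which makes visible the kind of ratio that casual reading misses.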

When a structural pattern matches a named frame from the frame vocabulary, Frame Check names the frame and asks a teaching question, like "what would a risk analyst say about this same data?" The question is the cognitive move; the library entry explains how to read past the frame.

Computational measurement only. No LLM in the framing layer.

Verification (supporting instrument)

Numerical claims are checked against authoritative data sources: SEC EDGAR (US company filings, annual and quarterly), World Bank, FRED, REST Countries, CoinGecko, Alpha Vantage, Wolfram Alpha, Wikipedia, and Brave Search as fallback.

Verification anchors framing findings to empirical reality where sources exist. It does not drive the analysis. Coverage varies by domain: financial claims with named companies and time periods have the strongest coverage; medical, legal, and niche-domain claims are often unresolved. Frame Check reports what it checked, what it verified, and what it could not reach, all at the same prominence as its findings.

Compare page

Paste two texts on the same subject to see where their framing diverges structurally. Frame Check analyzes each text independently and reports coverage differences, voice differences, shared blind spots (perspectives where neither text shows detected markers), numerical agreements, and numerical disagreements. Topic-generation mode (asking models to produce responses on a prompt) was removed at launch in favor of the text-comparison path; it may return when an account-based premium tier ships.
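
The comparison of coverage can be pictured as set arithmetic over the two independent analyses. The sketch below assumes each analysis yields a map from category name to detected marker count; that shape, the function name, and the output keys are hypothetical.

```python
def compare_coverage(a: dict[str, int], b: dict[str, int]) -> dict[str, list[str]]:
    """Compare two coverage maps (category -> detected marker count)."""
    categories = set(a) | set(b)
    covered_a = {c for c in categories if a.get(c, 0) > 0}
    covered_b = {c for c in categories if b.get(c, 0) > 0}
    return {
        "only_in_a": sorted(covered_a - covered_b),
        "only_in_b": sorted(covered_b - covered_a),
        # Perspectives where neither text shows detected markers.
        "shared_blind_spots": sorted(categories - covered_a - covered_b),
    }
```

A shared blind spot here means only that the detector found no markers in either text, not that both texts ignore the perspective.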

Web vs MCP: surface asymmetry

Frame Check ships through two surfaces: this web product, and a Model Context Protocol server (frame-check-mcp) that agent runtimes can call directly. Both surfaces compose the same deterministic substrate. Both produce identical FVS frame matches, voice classification, coverage signal, temporal distribution, and epistemic basis on the same input; a parity test in the repository pins this contract across the corpus.

The MCP surface additionally exposes four analyzers that the web product does not yet surface: a genre classifier (recommendation / analysis / narrative / advocacy / exploration / instruction with honest abstention), per-frame deepening detectors (temporal scope, stakeholder map, falsification conditions), absence clusters that group structurally absent FVS frames into readings, and an opt-in LLM-augmented frame opportunities pass. These analyzers run cleanly on the MCP corpus harness today; they are held back from the web product until each one passes an expert-graded false-positive validation. Surfacing unvalidated classifiers to a public web audience is the failure mode the product cannot afford. The asymmetry is not a defect; it is the validation phase made visible.

JSON API status codes

The JSON endpoints (POST /api/profile, POST /api/reframe) use proper HTTP status codes so a programmatic consumer can branch on response.status_code without parsing the error body:

200 — analysis succeeded; JSON body carries the structural framing portrait.
400 — validation error other than size: missing field, malformed JSON, adversarial input pattern (the substrate rejects extreme runs of digits or repeated characters as ReDoS-class triggers), text below the analyzability floor.
413 — payload too large: the text or source exceeds the configured cap (currently 20,000 / 30,000 characters respectively). The error body names the observed length and the maximum so the consumer can split or hit the MCP long-text surface, which is sized for analytical text up to book-length.
429 — rate limit or daily cost budget exceeded. The Retry-After header indicates when to retry.
504 — analyzer timed out. The request reached the server but the analysis did not complete within the wall-clock budget.
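
A consumer might branch on these codes roughly as follows. This is a sketch, not client code shipped with Frame Check: the action names and the fallback wait times are assumptions, and it reads Retry-After only in its delta-seconds form.

```python
from typing import Mapping


def next_action(status_code: int, headers: Mapping[str, str]) -> tuple[str, float]:
    """Map a response status to (action, seconds_to_wait).

    Action names and default waits are hypothetical.
    """
    if status_code == 200:
        return ("use_result", 0.0)
    if status_code == 400:
        return ("fix_request", 0.0)
    if status_code == 413:
        # Split the input, or send it to the MCP long-text surface.
        return ("split_or_use_mcp", 0.0)
    if status_code == 429:
        # Retry-After may also be an HTTP date; this sketch handles seconds only.
        # A plain dict lookup is case-sensitive; HTTP libraries usually are not.
        raw = headers.get("Retry-After", "")
        wait = float(raw) if raw.isdigit() else 60.0
        return ("retry_later", wait)
    if status_code == 504:
        return ("retry_later", 5.0)
    return ("unexpected", 0.0)
```

For example, `next_action(429, {"Retry-After": "30"})` returns `("retry_later", 30.0)`, without the caller ever parsing the error body.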

The SSE endpoint (GET /api/compare-stream) emits an error event for every failure mode (missing session, rate limit, budget, circuit breaker) so an EventSource consumer reads the same content-type and event grammar across success and failure paths.
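
Because failures arrive as events rather than as a different content type, one parser can serve both paths. The sketch below parses the standard `event:` and `data:` fields of a server-sent event stream. Only the `error` event name comes from the description above; the `result` event name and the payload text in the usage note are assumptions.

```python
def parse_sse(stream_text: str) -> list[tuple[str, str]]:
    """Parse server-sent event text into (event_name, data) pairs."""
    events: list[tuple[str, str]] = []
    name, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                events.append((name, "\n".join(data_lines)))
            name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                name = value
            elif field == "data":
                data_lines.append(value)
    return events
```

Given a stream containing a hypothetical `result` event followed by `event: error` with `data: rate limit`, the function returns both as pairs, and the consumer can branch on the event name alone.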

Design principles

The frame is the product. Not "is this true?" but "how does this text position you?" Frame Check names the frame, surfaces gaps in structural coverage, and asks the question that helps you see past it.
Teaching, not verdicts. Every detection comes with a teaching question. The goal is not a trust score. The goal is a reader who thinks differently about what they just read.
Integrity first. Better to honestly report "5 of 8 claims checked, 3 could not be matched" than to overclaim coverage. The tool names its own limits at the same prominence as its findings.
Computational measurement first. Framing analysis is deterministic and reproducible. Verification uses structured APIs. LLM interpretation is optional and clearly labeled.

What it does not measure

Out of scope

Reasoning quality. Logical validity. Whether conclusions follow from premises. Whether analysis is insightful. Whether the output is useful for your specific purpose. These require human judgment. Frame Check can tell you a number is unsourced, contradicted, or unstable, and which analytical dimensions show no detected markers. It cannot tell you whether the argument built on that number is sound.

What Frame Check cannot detect

The verification engines have systematic blindspots. Naming them is part of the tool's integrity: the goal is a reader who knows where the measurement ends, not a reader who trusts a score.

Semantic fabrication with sourced vocabulary. A sentence built from real terminology (correct entity names, plausible figures, familiar units) can be invented whole. If a database returns a matching number and the entity exists, the check reports "verified" even when the specific composition is wrong. Example: "NVIDIA's data center revenue was $47.5 billion in Q3 2024." If $47.5 billion appears elsewhere (a different segment, a different quarter, an analyst projection), surface-level matching can pass it. The sentence as written can still be fabricated.
Paraphrase that defeats substring matching. Source-fidelity checking is literal. "Revenue rose 12%" and "The company's revenue increased by 12 percent" are semantically identical, but only the first matches "12%" as a substring. Fuzzy matching catches some of these; semantic paraphrase beyond keyword overlap passes through as unsourced, even when the source supports it.
Derivation correctness, beyond the few we check. Frame Check's math checker catches explicit arithmetic like "$100B to $130B, a 30% increase." Most derivations are outside its patterns: correlations between series, forecast extrapolations, ratios constructed from multiple figures. If the underlying numbers exist in authoritative sources but the relationship asserted between them is wrong, Frame Check reports verified.
Sentence-level grounding on number-dense sources. The G/F/P card uses a per-sentence derivation check that becomes noisy as the source accumulates unique numbers: with more than roughly fifteen source numbers, most fabricated values pass the check via coincidental arithmetic match. The card reports the regime ("saturated" vs "diagnostic") so the reliability of that signal is visible per text; on saturated sources, the source-fidelity digit match is the authoritative signal for numerical claims.
Entities outside the source network. The nine authoritative APIs have coverage edges. Claims about private companies, subsidiaries, historical figures not in Wikipedia, niche scientific parameters, or region-specific data outside the covered databases return as "unverifiable." Unverifiable is not the same as false. It means the tool cannot reach data about the claim, not that the claim is wrong.
Web-published data outside the API network. Verification queries nine structured APIs (SEC filings, FRED, World Bank, REST Countries, CoinGecko, Alpha Vantage, Wolfram Alpha, Wikipedia, Brave Search). Claims sourced from company websites, press releases, industry reports, or analyst forecasts are outside this coverage even when the numbers are verifiable against their original web source. An AI that cites 75 web sources can produce a text where Frame Check shows "0 verified" because the cited sources are not in the API set. The framing analysis is the primary signal for these texts.
Under-detection of structural markers. The analytical-coverage and source-attribution signals are vocabulary-and-pattern based. A text may discuss causes, risks, stakeholders, trends, or uncertainty using language the detector does not recognize (for example, implicit causation via "rationale centers on," or scholarly attribution via "observers raise," "analysts argue"). The detector reports no markers even though a reader recognizes the coverage. A "missing" or "no markers detected" flag is therefore a lower-bound claim about what the detector found, not an upper-bound claim about what the text covers. See the methodology for how coverage is measured and where it is weak.
Intent, stakes, and context. Frame Check measures structural framing (what is emphasized, what is omitted) and claim coverage (what is supported, what is not). It does not judge whether the framing is appropriate for the purpose, whether the omissions are deliberate, or whether the claims matter for the decision at hand. That is the reader's work.
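
Two of the blind spots above are easy to reproduce in a few lines. The sketch below shows a literal substring check missing a paraphrase, and an explicit-arithmetic check of the "X to Y, a Z% increase" kind. Both functions are illustrative stand-ins, not Frame Check's own checkers, and the half-point tolerance is an assumption.

```python
def literal_match(claim_value: str, source: str) -> bool:
    """Literal substring check: true only if the exact string appears."""
    return claim_value in source


def percent_change_consistent(start: float, end: float, claimed_pct: float,
                              tolerance: float = 0.5) -> bool:
    """Check an explicit 'start to end, a claimed_pct% increase' claim.

    tolerance is in percentage points and is a hypothetical default.
    """
    if start == 0:
        return False  # percent change from zero is undefined
    actual = (end - start) / start * 100
    return abs(actual - claimed_pct) <= tolerance
```

Here `literal_match("12%", "The company's revenue increased by 12 percent")` is false even though the source supports the claim, while `percent_change_consistent(100, 130, 30)` is true. Neither function can judge a relationship that was never stated as explicit arithmetic.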

How it's checked

The detection and verification methods are written up in full on the methodology page, so any claim here can be checked or reproduced. That page also covers what has been tested, what has not, and the known limits.

The verification sources each behave differently against known-true claims. A small calibration corpus and a harness to measure each source's precision and recall live in the repo under calibration/, with per-run results at /corpus/calibration/. The reliability tiers shown on verification cards come from those numbers; uncalibrated data gets no confidence badge.
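
For readers unfamiliar with the two metrics, the sketch below computes precision and recall for one source from labeled outcomes. It is not the harness under calibration/; the input shape is hypothetical, and it assumes the labeled set includes some claims known to be false, since precision cannot be measured on true claims alone.

```python
def precision_recall(results: list[tuple[bool, bool]]) -> tuple[float, float]:
    """Compute (precision, recall) for one verification source.

    Each item is (source_said_verified, claim_is_actually_true).
    """
    tp = sum(1 for said, truth in results if said and truth)
    fp = sum(1 for said, truth in results if said and not truth)
    fn = sum(1 for said, truth in results if not said and truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Precision answers "when this source says verified, how often is the claim true?"; recall answers "of the true claims, how many did this source manage to verify?".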

The reliability of named-frame matching is measured separately, across several models, and the known gaps are written down openly rather than hidden. The methodology page has the numbers and the gap list.

Privacy and the Frame Check Corpus

Every analysis Frame Check runs contributes structural metadata to the Frame Check Corpus, a privacy-respecting open dataset of how frontier AI models frame reality on factual questions. The corpus is licensed CC-BY-4.0 and is hosted in the EU.

What is recorded

Structural derivatives only, never content.

Analysis mode (single, compare, observatory)
Model lineage when known (provider + canonical name)
Coverage counts and category names
Temporal orientation percentages
Voice classification, epistemic basis counts
Number of verified and contradicted claims
Categorical names of sources queried
Aggregate cost in USD and latency in milliseconds
Trust label and trust score

What is never recorded

The schema cannot represent these fields, so they cannot leak.

The text you submitted
The source material you provided
The topic string you typed
Any sentence, phrase, or fragment from your input
Any text from any source response
Any URL of any kind
Your IP address
Your email address
Any device fingerprint or session identifier

Privacy by construction, not by promise. The full schema, the privacy threat model, and the recording principle are documented in the Frame Check Corpus methodology.
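
One way to picture a schema that "cannot represent" content is a record type with only enumerated and numeric fields. The sketch below is an illustration of that idea, not the corpus schema itself: the class, field names, and validation are all hypothetical, and the real schema is the one documented in the corpus methodology.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    SINGLE = "single"
    COMPARE = "compare"
    OBSERVATORY = "observatory"


@dataclass(frozen=True)
class CorpusRecord:
    """Hypothetical structural record: an enum and counts, no free-text fields."""
    mode: Mode
    covered_categories: int
    verified_claims: int
    contradicted_claims: int
    latency_ms: int

    def __post_init__(self) -> None:
        # Python does not enforce annotations at runtime, so check explicitly.
        if not isinstance(self.mode, Mode):
            raise TypeError("mode must be a Mode")
        for name in ("covered_categories", "verified_claims",
                     "contradicted_claims", "latency_ms"):
            if not isinstance(getattr(self, name), int):
                raise TypeError(f"{name} must be an int")
```

With this shape, passing `text="..."` to the constructor raises `TypeError` because no such field exists, and a string smuggled into a count field is rejected by the type check.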
