About Frame Check
Frame Check shows you the frame any text takes: which perspectives it leans on, where its coverage is thin, and how it positions you as the reader. It also checks numerical claims against authoritative sources where it can.
It came out of a pile of small experiments on how AI frames things. The methodology and the frame vocabulary behind it are written up openly, and the code is open source, but this is a working tool, not a research product. Framing analysis is the point; verification is a floor, bounded to what holds up for free.
The problem
Better AI models produce more persuasive framing, not less biased framing. More fluent. More confident. Harder to see through. A 2,000-word analysis that mentions risk once and growth fourteen times feels comprehensive. The ratio is invisible to casual reading. Every claim is presented with identical confidence, whether it comes from training data or was generated on the spot.
How it works
Structural framing analysis (the core)
Five deterministic measurements compute the structural frame: analytical coverage across five categories with density per 1,000 words, temporal orientation, voice classification across five registers, epistemic basis, and claim-density structure. A synthesis portrait describes what the text does to the reader; a separate rule layer suggests named frame patterns from the library when their identification cues fire.
When a structural pattern matches a named frame from the frame vocabulary, Frame Check names the frame and asks a teaching question, like "what would a risk analyst say about this same data?" The question is the cognitive move; the library entry explains how to read past the frame.
Computational measurement only. No LLM in the framing layer.
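As an illustration of the density-per-1,000-words measurement, here is a minimal sketch. The category names and marker word lists are hypothetical placeholders, not the vocabularies Frame Check actually uses, and only two of the five categories are shown.

```python
import re

# Hypothetical marker vocabularies for illustration only; the real
# categories and word lists are defined in the methodology.
MARKERS = {
    "risk": {"risk", "risks", "downside", "threat"},
    "growth": {"growth", "grow", "expansion", "upside"},
}


def coverage_density(text: str) -> dict[str, float]:
    """Return marker hits per 1,000 words for each category."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {name: 0.0 for name in MARKERS}
    return {
        name: 1000 * sum(word in vocab for word in words) / len(words)
        for name, vocab in MARKERS.items()
    }
```

Because the function is a pure count over the text, the same input always yields the same densities, which is the property the deterministic framing layer relies on.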
Verification (supporting instrument)
Numerical claims are checked against authoritative data sources: SEC EDGAR (US company filings, annual and quarterly), World Bank, FRED, REST Countries, CoinGecko, Alpha Vantage, Wolfram Alpha, Wikipedia, and Brave Search as fallback.
Verification anchors framing findings to empirical reality where sources exist. It does not drive the analysis. Coverage varies by domain: financial claims with named companies and time periods have the strongest coverage; medical, legal, and niche-domain claims are often unresolved. Frame Check reports what it checked, what it verified, and what it could not reach at the same prominence as its findings.
Compare page
Paste two texts on the same subject to see where their framing diverges structurally. Frame Check analyzes each text independently and reports coverage differences, voice differences, shared blind spots (perspectives where neither text shows detected markers), numerical agreements, and numerical disagreements. Topic-generation mode (asking models to produce responses on a prompt) was removed at launch in favor of the text-comparison path; it may return when an account-based premium tier ships.
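The coverage part of that comparison can be sketched as set arithmetic over per-category marker counts. The function name, the input shape (category name mapped to a marker count), and the output keys are assumptions made for this example, not the product's actual interface.

```python
def compare_coverage(a: dict[str, int], b: dict[str, int]) -> dict[str, list[str]]:
    """Compare two independently computed coverage profiles."""
    categories = set(a) | set(b)
    # A category counts as covered when at least one marker was detected.
    covered_a = {c for c in categories if a.get(c, 0) > 0}
    covered_b = {c for c in categories if b.get(c, 0) > 0}
    return {
        "only_in_a": sorted(covered_a - covered_b),
        "only_in_b": sorted(covered_b - covered_a),
        # Shared blind spots: neither text shows detected markers.
        "shared_blind_spots": sorted(categories - covered_a - covered_b),
    }
```

Each profile is computed before the comparison runs, which mirrors the independent analysis of the two texts described above.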
Web vs MCP: surface asymmetry
Frame Check ships through two surfaces: this web product, and a Model Context Protocol server (frame-check-mcp) that agent runtimes can call directly. Both surfaces compose the same deterministic substrate. Both produce identical FVS frame matches, voice classification, coverage signal, temporal distribution, and epistemic basis on the same input; a parity test in the repository pins this contract across the corpus.
The MCP surface additionally exposes four analyzers that the web product does not yet surface: a genre classifier (recommendation / analysis / narrative / advocacy / exploration / instruction with honest abstention), per-frame deepening detectors (temporal scope, stakeholder map, falsification conditions), absence clusters that group structurally absent FVS frames into readings, and an opt-in LLM-augmented frame opportunities pass. These analyzers run cleanly on the MCP corpus harness today; they are held back from the web product until each one passes an expert-graded false-positive validation. Surfacing unvalidated classifiers to a public web audience is the failure mode the product cannot afford. The asymmetry is not a defect; it is the validation phase made visible.
JSON API status codes
The JSON endpoints (POST /api/profile, POST /api/reframe) use proper HTTP status codes so a programmatic consumer can branch on response.status_code without parsing the error body:
- 200 — analysis succeeded; JSON body carries the structural framing portrait.
- 400 — validation error other than size: missing field, malformed JSON, adversarial input pattern (the substrate rejects extreme runs of digits or repeated characters as ReDoS-class triggers), text below the analyzability floor.
- 413 — payload too large: the text or source exceeds the configured cap (currently 20,000 / 30,000 characters respectively). The error body names the observed length and the maximum so the consumer can split the input or switch to the MCP long-text surface, which is sized for analytical text up to book-length.
- 429 — rate limit or daily cost budget exceeded. The Retry-After header indicates when to retry.
- 504 — analyzer timed out. The request reached the server but the analysis did not complete within the wall-clock budget.
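A consumer can branch on those codes roughly as follows. This is a sketch: the function name and the action strings it returns are invented for the example, and only the status codes and the Retry-After header come from the list above.

```python
def next_action(status_code: int, headers: dict[str, str]) -> str:
    """Map a JSON endpoint response to a coarse client action."""
    # HTTP header names are case-insensitive, so normalize the keys.
    lowered = {key.lower(): value for key, value in headers.items()}
    if status_code == 200:
        return "use_result"
    if status_code == 400:
        return "fix_request"
    if status_code == 413:
        return "split_or_use_mcp"
    if status_code == 429:
        # Pass the header value through unchanged; it may be a number
        # of seconds or an HTTP date.
        return "retry_after:" + lowered.get("retry-after", "unknown")
    if status_code == 504:
        return "retry_later"
    return "unexpected"
```

With an HTTP client such as requests, the call would look like next_action(response.status_code, dict(response.headers)).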
The SSE endpoint (GET /api/compare-stream) emits an error event for every failure mode (missing session, rate limit, budget, circuit breaker) so an EventSource consumer reads the same content-type and event grammar across success and failure paths.
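To show what a shared event grammar means for a client, here is a minimal parser for the standard text/event-stream framing. The "progress" event name and both JSON payloads in the sample are hypothetical; only the existence of an error event is stated above.

```python
def parse_sse(stream_text: str) -> list[tuple[str, str]]:
    """Parse text/event-stream content into (event_type, data) pairs."""
    events: list[tuple[str, str]] = []
    event_type, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line dispatches the buffered event, if any.
            if data_lines:
                events.append((event_type, "\n".join(data_lines)))
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
    return events


# Hypothetical stream: one success-path event, then a failure.
SAMPLE = (
    'event: progress\ndata: {"step": 1}\n\n'
    'event: error\ndata: {"reason": "rate_limit"}\n\n'
)
```

The same loop handles both events; a consumer only needs to check whether the event type is "error", with no separate HTTP-level failure path.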
Design principles
- The frame is the product. Not "is this true?" but "how does this text position you?" Frame Check names the frame, surfaces gaps in structural coverage, and asks the question that helps you see past it.
- Teaching, not verdicts. Every detection comes with a teaching question. The goal is not a trust score. The goal is a reader who thinks differently about what they just read.
- Integrity first. Better to honestly report "5 of 8 claims checked, 3 could not be matched" than to overclaim coverage. The tool names its own limits at the same prominence as its findings.
- Computational measurement first. Framing analysis is deterministic and reproducible. Verification uses structured APIs. LLM interpretation is optional and clearly labeled.
What it does not measure
Reasoning quality. Logical validity. Whether conclusions follow from premises. Whether analysis is insightful. Whether the output is useful for your specific purpose. These require human judgment. Frame Check can tell you a number is unsourced, contradicted, or unstable, and which analytical dimensions show no detected markers. It cannot tell you whether the argument built on that number is sound.
What Frame Check cannot detect
The verification engines have systematic blindspots. Naming them is part of the tool's integrity: the goal is a reader who knows where the measurement ends, not a reader who trusts a score.
- Semantic fabrication with sourced vocabulary.
- A sentence built from real terminology (correct entity names, plausible figures, familiar units) can be invented whole. If a database returns a matching number and the entity exists, the check reports "verified" even when the specific composition is wrong. Example: "NVIDIA's data center revenue was $47.5 billion in Q3 2024." If $47.5 billion appears elsewhere (a different segment, a different quarter, an analyst projection), surface-level matching can pass it. The sentence as written can still be fabricated.
- Paraphrase that defeats substring matching.
- Source-fidelity checking is literal. "Revenue rose 12%" and "The company's revenue increased by 12 percent" are semantically identical, but only the first matches "12%" as a substring. Fuzzy matching catches some of these; semantic paraphrase beyond keyword overlap passes through as unsourced, even when the source supports it.
- Derivation correctness, beyond the few we check.
- Frame Check's math checker catches explicit arithmetic like "$100B to $130B, a 30% increase." Most derivations are outside its patterns: correlations between series, forecast extrapolations, ratios constructed from multiple figures. If the underlying numbers exist in authoritative sources but the relationship asserted between them is wrong, Frame Check reports verified.
- Sentence-level grounding on number-dense sources.
- The G/F/P card uses a per-sentence derivation check that becomes noisy as the source accumulates unique numbers: with more than roughly fifteen source numbers, most fabricated values pass the check via coincidental arithmetic match. The card reports the regime ("saturated" vs "diagnostic") so the reliability of that signal is visible per text; on saturated sources, the source-fidelity digit match is the authoritative signal for numerical claims.
- Entities outside the source network.
- The nine authoritative APIs have coverage edges. Claims about private companies, subsidiaries, historical figures not in Wikipedia, niche scientific parameters, or region-specific data outside the covered databases return as "unverifiable." Unverifiable is not the same as false. It means the tool cannot reach data about the claim, not that the claim is wrong.
- Web-published data outside the API network.
- Verification queries nine structured APIs (SEC filings, FRED, World Bank, REST Countries, CoinGecko, Alpha Vantage, Wolfram Alpha, Wikipedia, Brave Search). Claims sourced from company websites, press releases, industry reports, or analyst forecasts are outside this coverage even when the numbers are verifiable against their original web source. An AI that cites 75 web sources can produce a text where Frame Check shows "0 verified" because the cited sources are not in the API set. The framing analysis is the primary signal for these texts.
- Under-detection of structural markers.
- The analytical-coverage and source-attribution signals are vocabulary-and-pattern based. A text may discuss causes, risks, stakeholders, trends, or uncertainty using language the detector does not recognize (for example, implicit causation via "rationale centers on," or scholarly attribution via "observers raise," "analysts argue"). The detector reports no markers even though a reader recognizes the coverage. A "missing" or "no markers detected" flag is therefore a lower-bound claim about what the detector found, not an upper-bound claim about what the text covers. See the methodology for how coverage is measured and where it is weak.
- Intent, stakes, and context.
- Frame Check measures structural framing (what is emphasized, what is omitted) and claim coverage (what is supported, what is not). It does not judge whether the framing is appropriate for the purpose, whether the omissions are deliberate, or whether the claims matter for the decision at hand. That is the reader's work.
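The paraphrase limitation in the list above can be reproduced in a few lines. This is a simplified sketch, not Frame Check's matcher: both function names are hypothetical, and the single "percent" normalization stands in for whatever fuzzy matching the tool applies.

```python
import re


def literal_match(claim_number: str, source: str) -> bool:
    """Literal check: the claim's figure must appear verbatim in the source."""
    return claim_number in source


def normalized_match(claim_number: str, source: str) -> bool:
    """One fuzzy step: rewrite 'N percent' as 'N%' before matching."""
    normalized = re.sub(r"\s*percent\b", "%", source, flags=re.IGNORECASE)
    return claim_number in normalized


# Literal matching finds "12%" only in the first phrasing.
print(literal_match("12%", "Revenue rose 12%"))
print(literal_match("12%", "The company's revenue increased by 12 percent"))
# Normalization recovers the second phrasing, but not a paraphrase
# that shares no keywords with the claim.
print(normalized_match("12%", "The company's revenue increased by 12 percent"))
print(normalized_match("12%", "Revenue grew by roughly an eighth"))
```

The last case is the one that passes through as unsourced: the source supports the claim, but no surface-level rewrite connects the two strings.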
How it's checked
The detection and verification methods are written up in full on the methodology page, so any claim here can be checked or reproduced. That page also lists what has been tested, what has not, and the known limits.
The verification sources each behave differently against known-true claims. A small calibration corpus and a harness to measure each source's precision and recall live in the repo under calibration/, with per-run results at /corpus/calibration/. The reliability tiers shown on verification cards come from those numbers; no confidence badge is shown for uncalibrated data.
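For reference, per-source precision and recall can be computed as in the sketch below. The CalibrationCase shape is an assumption made for the example, and the sketch assumes the corpus also contains known-false claims, since precision cannot be measured from true claims alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CalibrationCase:
    claim_is_true: bool    # ground-truth label in the calibration corpus
    source_verified: bool  # whether the source reported the claim as verified


def precision_recall(cases: list[CalibrationCase]) -> tuple[float, float]:
    """Return (precision, recall) for one verification source."""
    tp = sum(c.claim_is_true and c.source_verified for c in cases)
    fp = sum((not c.claim_is_true) and c.source_verified for c in cases)
    fn = sum(c.claim_is_true and (not c.source_verified) for c in cases)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Precision answers "when this source says verified, how often is the claim true?" and recall answers "of the true claims, how many did this source confirm?".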
The reliability of named-frame matching is measured separately, across several models, and the known gaps are written down openly rather than hidden. The methodology page has the numbers and the gap list.
Privacy and the Frame Check Corpus
Every analysis Frame Check runs contributes structural metadata to the Frame Check Corpus, a privacy-respecting open dataset of how frontier AI models frame reality on factual questions. The corpus is licensed CC-BY-4.0 and is hosted in the EU.
What is recorded
Structural derivatives only, never content.
- Analysis mode (single, compare, observatory)
- Model lineage when known (provider + canonical name)
- Coverage counts and category names
- Temporal orientation percentages
- Voice classification, epistemic basis counts
- Number of verified and contradicted claims
- Categorical names of sources queried
- Aggregate cost in USD and latency in milliseconds
- Trust label and trust score
What is never recorded
The schema cannot represent these fields, so they cannot leak.
- The text you submitted
- The source material you provided
- The topic string you typed
- Any sentence, phrase, or fragment from your input
- Any text from any source response
- Any URL of any kind
- Your IP address
- Your email address
- Any device fingerprint or session identifier
Privacy by construction, not by promise. The full schema, the privacy threat model, and the recording principle are documented in the Frame Check Corpus methodology.
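The idea of a schema that cannot represent content can be sketched with a fixed-field record type. The field names below are hypothetical and cover only a few of the recorded items; the actual schema is the one documented in the corpus methodology.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CorpusRecord:
    # Structural derivatives only; there is no field for submitted text.
    analysis_mode: str
    coverage_counts: dict[str, int]
    verified_claims: int
    contradicted_claims: int
    latency_ms: int


record = CorpusRecord(
    analysis_mode="single",
    coverage_counts={"risk": 1, "growth": 14},
    verified_claims=5,
    contradicted_claims=0,
    latency_ms=840,
)


def try_record_with_text() -> bool:
    """Attempt to attach input text; return True only if it is accepted."""
    try:
        CorpusRecord(
            analysis_mode="single",
            coverage_counts={},
            verified_claims=0,
            contradicted_claims=0,
            latency_ms=0,
            text="the submitted text",  # not a field in the schema
        )
    except TypeError:
        return False
    return True
```

A free-form string field such as analysis_mode could still carry arbitrary text in this sketch, so a stricter version would constrain it to an enumerated set of values.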