Frame Check

Observatory snapshot

First observed run: 2026-04-12 07:11 UTC. Last observed run: 2026-04-28 11:39 UTC. 64 distinct runs in this window, 21 unique topics.

The Frame Check Observatory runs a fixed set of factual questions through multiple frontier AI models on a weekly or monthly cadence, records each response, extracts numerical claims, and verifies them against the Source Network. The extraction and verification protocol is documented in the methodology; the named frame patterns detected in each response are cataloged in the Frame Vocabulary Standard. All events are written to an open-data corpus licensed CC-BY-4.0. This page is a snapshot built from the corpus at site-build time; it is a slice, not a live dashboard.

Scope of this snapshot

This is the first committed snapshot. It covers a small window and a narrow model roster; the numbers are real but the N is small. The longitudinal claims Frame Check will eventually make (model-level framing drift over months, verification-rate trends across model generations) require far more cycles than this snapshot captures. Read this as a demonstration that the measurement pipeline is working end to end, not as a finding.

Per-topic breakdown

What this snapshot shows

The table aggregates all observed runs for each topic. "Voice" is the dominant voice classification for each provider on the topic; the rightmost column shows whether providers agreed. The "Contradicted" column counts verdicts, aggregated across providers, where the Source Network found that the model's value disagreed with an authoritative source. The "Unresolved" column counts verdicts where no source had the data. On narrow viewports the table scrolls horizontally so every column stays legible.

Per-topic observatory breakdown. One row per topic tracked in this window. Columns: topic name, class, cadence, number of runs, total claims extracted, matched count (verified plus close), contradicted count, unresolved count, voice agreement between observed providers.
Topic | Class | Cadence | Runs | Claims | Matched | Contradicted | Unresolved | Voice agreement
adult_human_bones_count | evergreen | monthly | 1 | 38 | 33 | 2 | 3 | analytical (all)
apple_revenue_recent | financial | weekly | 2 | 58 | 44 | 13 | 1 | analytical (all)
atmospheric_co2_ppm_current | scientific | monthly | 2 | 46 | 44 | 1 | 1 | analytical (all)
bitcoin_all_time_high_usd | financial | monthly | 1 | 43 | 27 | 4 | 3 | analytical (all)
carbon_12_atomic_mass | scientific | monthly | 1 | 33 | 28 | 0 | 0 | analytical (all)
china_population_current | evergreen | weekly | 2 | 103 | 50 | 7 | 0 | analytical (all)
earth_sun_distance_au_meters | scientific | monthly | 1 | 33 | 21 | 1 | 3 | mixed: gemini promotional, grok analytical
everest_height_meters | scientific | monthly | 1 | 47 | 40 | 2 | 2 | analytical (all)
global_life_expectancy_current | health | monthly | 1 | 25 | 18 | 2 | 1 | analytical (all)
india_population_current | evergreen | weekly | 3 | 122 | 73 | 6 | 5 | analytical (all)
israel_gdp_current | geopolitical | weekly | 2 | 64 | 40 | 10 | 7 | mixed: gemini promotional, grok analytical
microsoft_revenue_recent | financial | weekly | 2 | 64 | 56 | 5 | 2 | analytical (all)
nvidia_revenue_recent | financial | weekly | 3 | 69 | 56 | 7 | 4 | mixed: gemini promotional, grok analytical
proton_rest_mass_kg | scientific | monthly | 1 | 64 | 35 | 0 | 3 | analytical (all)
semaglutide_weight_loss | health | weekly | 3 | 148 | 105 | 9 | 2 | analytical (all)
speed_of_light_vacuum | scientific | weekly | 3 | 151 | 113 | 4 | 3 | analytical (all)
taiwan_population_current | geopolitical | monthly | 1 | 42 | 21 | 0 | 3 | analytical (all)
tesla_2023_deliveries | financial | monthly | 2 | 25 | 23 | 5 | 2 | analytical (all)
ukraine_population_current | geopolitical | weekly | 2 | 54 | 23 | 3 | 25 | mixed: gemini promotional, grok analytical
un_member_states_count | evergreen | monthly | 1 | 31 | 22 | 3 | 0 | analytical (grok)
world_population_current | evergreen | monthly | 1 | 64 | 44 | 3 | 1 | analytical (all)
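The "Voice agreement" column can be read as a small reduction over per-response voice labels. The sketch below is one plausible way to derive it, not the observatory's actual code: the function names and the input shape (a mapping from provider to a list of voice labels) are assumptions made for illustration, and ties between voices are broken by first occurrence.

```python
from collections import Counter


def dominant_voice(labels):
    """Return the most frequent voice label (ties go to the first one seen)."""
    return Counter(labels).most_common(1)[0][0]


def voice_agreement(voices_by_provider):
    """Summarize whether providers share a dominant voice on one topic.

    `voices_by_provider` is a hypothetical input shape:
    {"gemini": ["analytical", ...], "grok": [...]}.
    """
    dominant = {
        provider: dominant_voice(labels)
        for provider, labels in sorted(voices_by_provider.items())
        if labels  # skip providers with no classified responses
    }
    if not dominant:
        raise ValueError("no classified responses for this topic")

    distinct = set(dominant.values())
    if len(distinct) == 1:
        voice = next(iter(distinct))
        # With a single provider, name it instead of claiming "all".
        scope = "all" if len(dominant) > 1 else next(iter(dominant))
        return f"{voice} ({scope})"
    return "mixed: " + ", ".join(f"{p} {v}" for p, v in dominant.items())


print(voice_agreement({
    "gemini": ["promotional", "promotional", "analytical"],
    "grok": ["analytical", "analytical"],
}))
# prints "mixed: gemini promotional, grok analytical"
```

A topic observed on only one provider yields a label such as "analytical (grok)", which is how the sketch distinguishes single-provider rows from genuine agreement.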

Taxonomy note. "Matched" combines verified (exact) and close (fuzzy / tolerance) matches, both successes against a known source. The Source Network also emits projection and disputed verdicts; this snapshot contains 47 such verdicts not broken out in the table above. The per-claim record, including all six verdict categories, is in the corpus event stream.
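The folding described in the taxonomy note can be sketched as a lookup from verdict category to table column. This is an illustration under assumptions: the mapping name, the "other" bucket for projection and disputed, and the treatment of "unverifiable" as the table's "Unresolved" column are inferred from the descriptions on this page, not taken from the corpus schema.

```python
from collections import Counter

# Assumed mapping from the six verdict categories to the table's columns.
COLUMN_FOR_VERDICT = {
    "verified": "matched",       # exact match against a source
    "close": "matched",          # fuzzy / tolerance match
    "contradicted": "contradicted",
    "unverifiable": "unresolved",
    "projection": "other",       # not broken out in the per-topic table
    "disputed": "other",
}


def fold_verdicts(verdicts):
    """Count verdicts per table column; unknown categories raise KeyError."""
    counts = Counter({"matched": 0, "contradicted": 0, "unresolved": 0, "other": 0})
    for verdict in verdicts:
        counts[COLUMN_FOR_VERDICT[verdict]] += 1
    return dict(counts)


# Hypothetical verdicts for a single topic, for illustration only.
sample = ["verified", "close", "verified", "contradicted", "projection"]
print(fold_verdicts(sample))
# prints {'matched': 3, 'contradicted': 1, 'unresolved': 0, 'other': 1}
```

Raising on an unknown category, rather than silently bucketing it, keeps a new verdict type from disappearing into an existing column.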

Models observed

Framing voice distribution

Voice classification of each model response, aggregated by provider. Analytical voice is expected on factual questions; promotional voice on a factual question is a signal worth attention.

Provider | Voice | Responses
gemini | analytical | 34
gemini | promotional | 6
grok | analytical | 35
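Since promotional voice on a factual question is the signal of interest, the counts above can be turned into a per-provider promotional share. The following is a minimal sketch using the numbers from this table; the function name and the dictionary layout are illustrative assumptions.

```python
# Response counts copied from the table above: (provider, voice) -> responses.
VOICE_COUNTS = {
    ("gemini", "analytical"): 34,
    ("gemini", "promotional"): 6,
    ("grok", "analytical"): 35,
}


def promotional_share(counts):
    """Return, per provider, the fraction of responses classified promotional."""
    totals = {}
    promotional = {}
    for (provider, voice), n in counts.items():
        totals[provider] = totals.get(provider, 0) + n
        if voice == "promotional":
            promotional[provider] = promotional.get(provider, 0) + n
    return {p: promotional.get(p, 0) / totals[p] for p in totals}


for provider, share in promotional_share(VOICE_COUNTS).items():
    print(f"{provider}: {share:.1%}")
# prints "gemini: 15.0%" then "grok: 0.0%"
```

With 40 and 35 responses respectively, these shares carry wide uncertainty, consistent with the small-N caveat in the scope section.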

Verification verdicts

Verdict distribution across numerical claims extracted from the model responses. "Verified" and "contradicted" require a source match; "unverifiable" means no authoritative source had the data (the "Unresolved" column in the per-topic table); "projection" means the Source Network determined the claim to be an extrapolation rather than a fact.

Verdict | Claims
verified | 902
contradicted | 87
unverifiable | 71
close | 44
projection | 43
disputed | 5

Scope regime distribution

The Layer 11 derivation-regime field (grounding.projection_regime) was added to observatory telemetry in clarethium_measure v1.5 (2026-04-17). No observatory events in this snapshot carry the field yet; it will populate as new cycles run against a v1.5+ stack. The segmentation matters because the Layer 11 primary P-signal is reliable in diagnostic-regime sources, noisy in the transition regime, and effectively disabled in the saturated regime, so any longitudinal finding that aggregates has_projection across regimes will under-count projection on number-dense sources. This section is rendered now so that the contract is visible before the data exists; it fills in as data accrues.
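To illustrate why segmentation matters, the sketch below tallies has_projection separately per regime instead of pooling all events. The event shape is hypothetical: it assumes each event is a dictionary with a "grounding" object holding projection_regime and has_projection, which this page does not confirm, and it places events that predate the field in an "unlabeled" bucket.

```python
from collections import defaultdict


def projection_rate_by_regime(events):
    """Tally has_projection per regime rather than across all events.

    Assumes a hypothetical event shape:
    {"grounding": {"projection_regime": "diagnostic", "has_projection": True}}
    """
    tallies = defaultdict(lambda: [0, 0])  # regime -> [events, with projection]
    for event in events:
        grounding = event.get("grounding") or {}
        # Events written before v1.5 have no regime; keep them separate.
        regime = grounding.get("projection_regime") or "unlabeled"
        tallies[regime][0] += 1
        if grounding.get("has_projection"):
            tallies[regime][1] += 1
    return {
        regime: {"events": n, "with_projection": k, "rate": k / n}
        for regime, (n, k) in tallies.items()
    }


# Made-up events, for illustration only.
events = [
    {"grounding": {"projection_regime": "diagnostic", "has_projection": True}},
    {"grounding": {"projection_regime": "diagnostic", "has_projection": False}},
    {"grounding": {"projection_regime": "saturated", "has_projection": False}},
    {"grounding": {}},
]
for regime, stats in projection_rate_by_regime(events).items():
    print(regime, stats)
```

Reporting the saturated bucket on its own makes a low rate there readable as a disabled signal rather than as an absence of projection.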

Cost

Aggregate measured cost of the runs captured in this snapshot: $0.3952. Claims extracted: 1324. Contradicted: 87.
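The aggregate figures above imply unit costs that are easier to compare across snapshots. The arithmetic can be sketched as follows; the variable names are illustrative, and the inputs are the three numbers reported in this section.

```python
# Figures reported in this section.
TOTAL_COST_USD = 0.3952
CLAIMS_EXTRACTED = 1324
CONTRADICTED = 87

cost_per_claim = TOTAL_COST_USD / CLAIMS_EXTRACTED
cost_per_contradiction = TOTAL_COST_USD / CONTRADICTED

print(f"cost per extracted claim: ${cost_per_claim:.6f}")
print(f"cost per contradicted claim: ${cost_per_contradiction:.6f}")
```

This works out to roughly $0.0003 per extracted claim and roughly $0.0045 per contradicted claim in this window.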

What this does not yet tell you

The snapshot regenerates whenever the corpus site is rebuilt on a machine with the event store present. Future snapshots will accrue longer time windows and more model coverage.