Observatory snapshot

First observed run: 2026-04-12 07:11 UTC. Last observed run: 2026-04-28 11:39 UTC. 64 distinct runs in this window, 21 unique topics.

The Frame Check Observatory runs a fixed set of factual questions through multiple frontier AI models on a weekly or monthly cadence, records each response, extracts numerical claims, and verifies them against the Source Network. The extraction and verification protocol is documented in the methodology; the named frame patterns detected in each response are cataloged in the Frame Vocabulary Standard. All events are written to an open-data corpus licensed CC-BY-4.0. This page is a snapshot built from the corpus at site-build time; it is a slice, not a live dashboard.

Scope of this snapshot

This is the first committed snapshot. It covers a small window and a narrow model roster; the numbers are real, but the sample size is small. The longitudinal claims Frame Check will eventually make (model-level framing drift over months, verification-rate trends across model generations) require far more cycles than this snapshot captures. Read this as a demonstration that the measurement pipeline works end to end, not as a finding.

What this snapshot shows

Per-topic breakdown

Aggregates across all observed runs for each topic. "Voice" is the dominant voice classification for each provider on this topic; the rightmost column shows whether providers agreed. The "contradicted" column counts verdicts where the Source Network found that the model's value disagreed with an authoritative source, aggregated across providers. The "unresolved" column counts verdicts where no source had the data. On narrow viewports the table scrolls horizontally so every column stays legible.
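The voice-agreement column can be derived roughly as follows. This is a minimal sketch, not the observatory's actual code: the function names and the input layout (a dict mapping each provider to its per-response voice labels) are assumptions, and reading "(all)" as "every observed provider agreed" and "(grok)" as "only one provider was observed" is inferred from the table rather than stated in the methodology.

```python
from collections import Counter


def dominant_voice(voices):
    """Most frequent voice label; on a tie, the label seen first wins."""
    return Counter(voices).most_common(1)[0][0]


def agreement_label(voices_by_provider):
    """Build a voice-agreement string in the style of the per-topic table."""
    # Sort providers so the label is stable regardless of input order.
    dominant = {p: dominant_voice(v) for p, v in sorted(voices_by_provider.items())}
    distinct = set(dominant.values())
    if len(distinct) == 1:
        voice = distinct.pop()
        # Assumption: a single observed provider is named instead of "all".
        scope = "all" if len(dominant) > 1 else next(iter(dominant))
        return f"{voice} ({scope})"
    return "mixed: " + ", ".join(f"{p} {v}" for p, v in dominant.items())


# Hypothetical responses for one topic, not taken from the corpus.
example = {
    "grok": ["analytical", "analytical", "analytical"],
    "gemini": ["promotional", "promotional", "analytical"],
}
print(agreement_label(example))  # mixed: gemini promotional, grok analytical
```

How ties between voices are broken is not specified on this page; the sketch simply keeps the first label encountered.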

Per-topic observatory breakdown. One row per topic tracked in this window. Columns: topic name, class, cadence, number of runs, total claims extracted, matched count, contradicted count, unresolved count, voice agreement between observed providers.
Topic	Class	Cadence	Runs	Claims	Matched	Contradicted	Unresolved	Voice agreement
adult_human_bones_count	evergreen	monthly	1	38	33	2	3	analytical (all)
apple_revenue_recent	financial	weekly	2	58	44	13	1	analytical (all)
atmospheric_co2_ppm_current	scientific	monthly	2	46	44	1	1	analytical (all)
bitcoin_all_time_high_usd	financial	monthly	1	43	27	4	3	analytical (all)
carbon_12_atomic_mass	scientific	monthly	1	33	28	0	0	analytical (all)
china_population_current	evergreen	weekly	2	103	50	7	0	analytical (all)
earth_sun_distance_au_meters	scientific	monthly	1	33	21	1	3	mixed: gemini promotional, grok analytical
everest_height_meters	scientific	monthly	1	47	40	2	2	analytical (all)
global_life_expectancy_current	health	monthly	1	25	18	2	1	analytical (all)
india_population_current	evergreen	weekly	3	122	73	6	5	analytical (all)
israel_gdp_current	geopolitical	weekly	2	64	40	10	7	mixed: gemini promotional, grok analytical
microsoft_revenue_recent	financial	weekly	2	64	56	5	2	analytical (all)
nvidia_revenue_recent	financial	weekly	3	69	56	7	4	mixed: gemini promotional, grok analytical
proton_rest_mass_kg	scientific	monthly	1	64	35	0	3	analytical (all)
semaglutide_weight_loss	health	weekly	3	148	105	9	2	analytical (all)
speed_of_light_vacuum	scientific	weekly	3	151	113	4	3	analytical (all)
taiwan_population_current	geopolitical	monthly	1	42	21	0	3	analytical (all)
tesla_2023_deliveries	financial	monthly	2	25	23	5	2	analytical (all)
ukraine_population_current	geopolitical	weekly	2	54	23	3	25	mixed: gemini promotional, grok analytical
un_member_states_count	evergreen	monthly	1	31	22	3	0	analytical (grok)
world_population_current	evergreen	monthly	1	64	44	3	1	analytical (all)
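Because claim counts differ widely between topics, raw contradicted counts are easier to compare as rates. The sketch below recomputes per-topic shares for three rows copied from the table above; the tuple layout and the helper name `shares` are illustrative assumptions, and with this few runs the rates should be read as rough indicators only.

```python
# (topic, claims, matched, contradicted, unresolved), copied from the table.
ROWS = [
    ("apple_revenue_recent", 58, 44, 13, 1),
    ("speed_of_light_vacuum", 151, 113, 4, 3),
    ("ukraine_population_current", 54, 23, 3, 25),
]


def shares(row):
    """Return contradicted and unresolved counts as fractions of claims."""
    _topic, claims, _matched, contradicted, unresolved = row
    return {
        "contradicted": contradicted / claims,
        "unresolved": unresolved / claims,
    }


# Print topics from highest to lowest contradicted share.
for row in sorted(ROWS, key=lambda r: shares(r)["contradicted"], reverse=True):
    s = shares(row)
    print(f"{row[0]}: contradicted {s['contradicted']:.1%}, unresolved {s['unresolved']:.1%}")
# apple_revenue_recent: contradicted 22.4%, unresolved 1.7%
# ukraine_population_current: contradicted 5.6%, unresolved 46.3%
# speed_of_light_vacuum: contradicted 2.6%, unresolved 2.0%
```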

Taxonomy note. "Matched" combines verified (exact) and close (fuzzy / tolerance) matches, both successes against a known source. The Source Network also emits projection and disputed verdicts; this snapshot contains 47 such verdicts not broken out in the table above. The per-claim record, including all six verdict categories, is in the corpus event stream.
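To make the taxonomy concrete, here is a hedged sketch of how a single numerical claim might be assigned a verdict and then folded into a table column. It is not the Source Network's implementation: the 1% relative tolerance, the function names, and the mapping of "unverifiable" to the "unresolved" column are assumptions (the last one inferred from the matching definitions on this page). Projection and disputed verdicts are left out of the classifier because this page does not say how they are decided.

```python
import math


def verdict(claimed, source_value, rel_tol=0.01):
    """Classify one numerical claim against a source value (hypothetical rule)."""
    if source_value is None:
        return "unverifiable"  # no source had the data
    if claimed == source_value:
        return "verified"  # exact match
    if math.isclose(claimed, source_value, rel_tol=rel_tol):
        return "close"  # within the assumed tolerance
    return "contradicted"


# Verdict category -> per-topic table column. Categories missing from this
# mapping (projection, disputed) are not broken out in the table.
TABLE_COLUMN = {
    "verified": "matched",
    "close": "matched",
    "contradicted": "contradicted",
    "unverifiable": "unresolved",
}


def table_column(v):
    return TABLE_COLUMN.get(v)


print(verdict(299_792_458, 299_792_458))  # verified
print(verdict(8849, 8848.86))             # close
print(verdict(8000, 8848.86))             # contradicted
print(verdict(42, None))                  # unverifiable
print(table_column("close"))              # matched
```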

Models observed

Framing voice distribution

Voice classification of each model response, aggregated by provider. Analytical voice is expected on factual questions; promotional voice on a factual question is a signal worth attention.

Provider	Voice	Responses
gemini	analytical	34
gemini	promotional	6
grok	analytical	35

Verification verdicts

Verdict distribution across numerical claims extracted from the model responses. "Verified" and "contradicted" require a source match; "close" is a fuzzy or within-tolerance match; "unverifiable" means no authoritative source had the data; "projection" means the Source Network determined the claim to be an extrapolation rather than a fact.

Verdict	Claims
verified	902
contradicted	87
unverifiable	71
close	44
projection	43
disputed	5

Scope regime distribution

The Layer 11 derivation-regime field (grounding.projection_regime) was added to observatory telemetry in clarethium_measure v1.5 (2026-04-17). No observatory events in this snapshot carry the field yet; it will populate as new cycles run against a v1.5+ stack. The segmentation matters because the Layer 11 primary P-signal is reliable in diagnostic-regime sources, noisy in transition-regime sources, and effectively disabled in saturated ones: any longitudinal finding that aggregates has_projection across regimes will under-count projection on number-dense sources. This section is rendered now so the contract is visible before the data exists; it will fill in as data accrues.
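The under-counting concern can be illustrated with a small sketch that computes the has_projection rate per regime instead of pooling all events. Everything about the event layout here is an assumption: the nested "grounding" dict is a guess based on the dotted field name, the "unlabeled" bucket is a hypothetical way to handle pre-v1.5 events that lack the field, and the sample events are synthetic, since no event in this snapshot carries the field.

```python
from collections import defaultdict


def projection_rate_by_regime(events):
    """Return {regime: share of events with has_projection set}."""
    totals = defaultdict(lambda: [0, 0])  # regime -> [hits, total]
    for event in events:
        grounding = event.get("grounding") or {}
        # Events without the field go into a separate bucket.
        regime = grounding.get("projection_regime") or "unlabeled"
        totals[regime][0] += int(bool(event.get("has_projection")))
        totals[regime][1] += 1
    return {regime: hits / n for regime, (hits, n) in totals.items()}


# Synthetic events for illustration only.
events = (
    [{"grounding": {"projection_regime": "diagnostic"}, "has_projection": i < 2}
     for i in range(4)]
    + [{"grounding": {"projection_regime": "saturated"}, "has_projection": False}
       for _ in range(4)]
    + [{"has_projection": False}]  # event predating the field
)

rates = projection_rate_by_regime(events)
pooled = sum(bool(e.get("has_projection")) for e in events) / len(events)
print(rates["diagnostic"])  # 0.5
print(rates["saturated"])   # 0.0
print(round(pooled, 3))     # 0.222
```

In this synthetic sample the pooled rate (about 0.22) sits well below the diagnostic-regime rate (0.5) because the saturated events, where the signal is effectively disabled, dilute it.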

Cost

Aggregate measured cost of the runs captured in this snapshot: $0.3952. Claims extracted: 1324. Contradicted: 87.
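For scale, the three figures above can be turned into unit costs and an overall contradiction rate. The variable names are illustrative; the inputs are the numbers stated in this section.

```python
total_cost_usd = 0.3952
claims_extracted = 1324
contradicted = 87

cost_per_claim = total_cost_usd / claims_extracted
cost_per_contradiction = total_cost_usd / contradicted
contradiction_rate = contradicted / claims_extracted

print(f"cost per extracted claim:   ${cost_per_claim:.6f}")          # $0.000298
print(f"cost per contradicted claim: ${cost_per_contradiction:.4f}")  # $0.0045
print(f"contradiction rate:          {contradiction_rate:.1%}")       # 6.6%
```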

What this does not yet tell you

The snapshot regenerates whenever the corpus site is rebuilt on a machine with the event store present. Future snapshots will accrue longer time windows and more model coverage.