Frame Check

Uncertainty Frame

Calibration, Counterfactual

Identification

Uncertainty Frame fires when the document structurally organizes its analysis around what is unknown, contested, or assumption-dependent. The frame must be the ORGANIZING PRINCIPLE of the analysis, not a surface feature.

The uncertainty frame asks: what do we not know? Most AI-generated analysis presents conclusions with high confidence even when the underlying evidence is thin, assumptions are untested, or expert consensus does not exist. Documents that surface the gap between confidence of presentation and confidence of evidence exhibit this frame.

The frame fires when the document:

The frame does NOT fire when:

What this frame makes visible:

What this frame makes invisible:

Positive examples: A climate science summary that presents temperature projections as ranges (1.5-4.5 °C by 2100), with named sources of uncertainty (climate sensitivity, emissions pathway, feedback loops) as its organizing structure. Each projection carries its evidence quality. Uncertainty IS the frame.

Negative examples: An AI-generated market analysis that says "the AI market will reach $500 billion by 2028" without any qualifier, source, confidence interval, or acknowledgment of uncertainty. A document with occasional "perhaps" and "may" hedges in otherwise confident prose does NOT fire: uncertainty language alone isn't the frame; structural organization around uncertainty is.

Adjacent frames: Risk Frame (FVS-009, addresses what could go wrong, while uncertainty addresses what is unknown), Failure Framing (FVS-007, specifies what would make claims wrong, while uncertainty names what cannot yet be known; uncertainty can exist without explicit failure criteria, but the relationship is not symmetric), Completeness Illusion (FVS-010, the uncertainty dimension may be mentioned briefly without analysis), False Balance (FVS-017, false balance manufactures artificial uncertainty by elevating minority positions to majority-evidence levels; what reads as genuine uncertainty in the document may be a false-balance artifact), Authority by Citation (FVS-016, authority-by-citation strips uncertainty markers from citations; genuine citations include uncertainty as a signal of epistemic care), Temporal Anchoring (FVS-014, future-projected content gives uncertainty cover; temporal anchoring's future orientation can hide the epistemic uncertainty that uncertainty framing surfaces).

When this frame is appropriate: Scientific analysis, investment decisions, policy assessment, any context where the reader needs to distinguish between what is known with confidence and what is estimated, projected, or contested.

When this frame is misleading: Stable factual domains where uncertainty is negligible (the speed of light, the population of France, the boiling point of water). Applying uncertainty framing to well-established facts produces false balance. Also misleading when used to delay action on claims that are sufficiently certain for practical purposes.

Honest limits: The detection heuristic (presence of uncertainty-dimension markers) catches explicit uncertainty language (hedging, ranges, "may," "approximately," "estimated") but misses cases where uncertainty is high but the document presents false precision. A claim like "$500 billion by 2028" has enormous uncertainty but uses no uncertainty language. Under the revised (Phase 1C) definition, hedging language alone does NOT fire the frame; structural organization does. The detector can identify surface uncertainty language; whether the document's STRUCTURE is organized around uncertainty remains an interpretive judgment.
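
Illustrative sketch of the limit described above. This is not the production detector: the marker list and the `surface_markers` helper are hypothetical, built only from the examples named in the text (hedges, ranges, "may," "approximately," "estimated"). It shows how a surface-marker check flags hedged prose while returning nothing for a falsely precise claim.

```python
import re

# Hypothetical marker list drawn from the examples in the text;
# the real detector's vocabulary is not shown in this document.
HEDGE_WORDS = ("may", "might", "perhaps", "approximately", "estimated", "roughly")
HEDGE_RE = re.compile(r"\b(" + "|".join(HEDGE_WORDS) + r")\b", re.IGNORECASE)

# A numeric range such as "1.5-4.5" or "0.55 to 0.65".
RANGE_RE = re.compile(r"\d+(?:\.\d+)?\s*(?:-|to)\s*\$?\d+(?:\.\d+)?")


def surface_markers(text: str) -> list[str]:
    """Return every surface uncertainty marker found in the text."""
    found = [m.group(0).lower() for m in HEDGE_RE.finditer(text)]
    found += [m.group(0) for m in RANGE_RE.finditer(text)]
    return found


# Hedged sentence: markers are found.
print(surface_markers("Warming may reach 1.5-4.5C by 2100."))
# False precision: high real uncertainty, but no markers to find.
print(surface_markers("The AI market will reach $500 billion by 2028."))
```

The second call returns an empty list, which is exactly the blind spot: an empty result cannot distinguish "no uncertainty" from "uncertainty not disclosed." Neither result says anything about whether the document's structure is organized around uncertainty.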

Revision note (2026-04-23, Phase 1C): Revised from v1 to require structural organization around uncertainty as the primary analytical frame, not mere surface hedging. v1 permitted narrow and broad readings that produced low cross-family agreement (v2 mean AC1 0.359). The revised definition tightens toward the narrow reading, excluding cases where uncertainty language appears but does not organize the analysis. Predicted cross-family Gwet's AC1 lift: 0.359 → approximately 0.55-0.65.

Decision-readiness implication

Direct readiness implication.

When this frame fires, the document explicitly names what is unknown, contested, or assumption-dependent. Affects:

Absence of this frame in contexts where uncertainty is real is a structural overconfidence signal.

Generation affordances

Rewrite prompt structure: "For each projection, estimate, or forward-looking claim in this document, add an uncertainty annotation: what is the evidence quality (measured, estimated, projected, speculated)? What is the range of plausible values? What assumptions does this depend on? What do experts disagree about?"
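
One possible way to hold the answers to the rewrite prompt is a small record per claim. The sketch below is an assumption for illustration only: the `UncertaintyAnnotation` class, its field names, and `missing_fields` are hypothetical and not part of the engine. The four evidence-quality levels are taken from the prompt text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class EvidenceQuality(Enum):
    # The four levels named in the rewrite prompt.
    MEASURED = "measured"
    ESTIMATED = "estimated"
    PROJECTED = "projected"
    SPECULATED = "speculated"


@dataclass
class UncertaintyAnnotation:
    """Hypothetical record of one claim's answers to the rewrite prompt."""
    claim: str
    evidence_quality: EvidenceQuality
    plausible_range: Optional[tuple[float, float]] = None
    assumptions: list[str] = field(default_factory=list)
    expert_disagreements: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """Name the prompt questions that are still unanswered."""
        missing = []
        if self.plausible_range is None:
            missing.append("plausible_range")
        if not self.assumptions:
            missing.append("assumptions")
        if not self.expert_disagreements:
            missing.append("expert_disagreements")
        return missing


# A projection annotated only with its evidence quality.
annotation = UncertaintyAnnotation(
    claim="Global semiconductor revenue will exceed $1 trillion by 2030.",
    evidence_quality=EvidenceQuality.PROJECTED,
)
print(annotation.missing_fields())
```

Keeping the unanswered questions explicit makes it visible when a rewrite has labeled a claim as a projection but has still not supplied a range, assumptions, or points of expert disagreement.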

Counter-document prompt: "This document presents its conclusions with high confidence. Rewrite with honest uncertainty: for each point estimate, provide a range. For each projection, name the assumptions. For each 'experts say,' name the disagreements. The goal is not to undermine the analysis but to make the reader aware of where the floor might give way."

Salient questions under this frame:

Worked example

Document excerpt: "Global semiconductor revenue will exceed $1 trillion by 2030. Artificial intelligence will drive 40% of this growth, with data center chips accounting for the largest share."

Frame present: Confident projection. "$1 trillion by 2030" and "40% of this growth" are presented as facts.

Frame absent: Any uncertainty signal. Questions not addressed: whose projection? What is the confidence interval? ($800B to $1.2T? $600B to $1.5T?) What are the assumptions about AI adoption rates? What happens if there is a recession, a trade war, or a technology plateau? What was the accuracy of similar projections made 5 years ago?

How to read past it: For each number, ask: "Is this a measurement or a guess?" $1 trillion by 2030 is a guess (a projection). 40% AI-driven is a guess within a guess. Neither is wrong per se, but presenting them without uncertainty framing implies a precision that does not exist.

Branch applicability

Primary branch: A (document analysis)

Branch A: Detected via coverage analysis. Presence of uncertainty markers indicates the document acknowledges its own limits. ABSENCE of uncertainty markers in a document that makes forward-looking claims or uses point estimates is the actionable signal.
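
A minimal sketch of the Branch A absence signal, under loud assumptions: the three regular expressions and the `unhedged_claims` helper are hypothetical stand-ins, not the coverage analysis the engine actually runs. The idea is to flag sentences that make a forward-looking claim or state a point estimate while carrying no uncertainty marker.

```python
import re

# Hypothetical patterns for illustration; the engine's real rules are not shown here.
FORWARD_RE = re.compile(r"\bwill\b|\bby\s+20\d{2}\b", re.IGNORECASE)
POINT_ESTIMATE_RE = re.compile(
    r"\$\s?\d[\d,.]*\s*(?:trillion|billion|million)?|\d+(?:\.\d+)?\s?%",
    re.IGNORECASE,
)
MARKER_RE = re.compile(
    r"\b(?:may|might|could|approximately|estimated|projected|roughly|between)\b"
    r"|\d\s*(?:-|to)\s*\$?\d",
    re.IGNORECASE,
)


def unhedged_claims(text: str) -> list[str]:
    """Return sentences with a forward-looking claim or point estimate and no marker."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        makes_claim = FORWARD_RE.search(sentence) or POINT_ESTIMATE_RE.search(sentence)
        if makes_claim and not MARKER_RE.search(sentence):
            flagged.append(sentence)
    return flagged


excerpt = (
    "Global semiconductor revenue will exceed $1 trillion by 2030. "
    "Artificial intelligence will drive 40% of this growth, with data center "
    "chips accounting for the largest share."
)
for sentence in unhedged_claims(excerpt):
    print("UNHEDGED:", sentence)
```

Run on the worked-example excerpt, both sentences are flagged. A flag is a prompt for the reader's questions, not a verdict: it says where uncertainty is undisclosed, not how large it is.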

Branch B: The user can apply the uncertainty frame in pre-commit by asking "What am I uncertain about in my own assessment?" before seeing the AI's confident answer. The pre-commit makes the user's own uncertainty visible as a comparison point.

Vocabulary connections

Cross-family reliability

Engine-canonical reading (library_v4 ratified 2026-04-24). library_v4 Identification sections are byte-equivalent to library_v3 per fvs_eval/v4_2/LIBRARY_V3_TO_V4_RATIFICATION_v1.md. The V4.2 engine reads only the Identification section per `v4_2_engine.py::_extract_identification`, so cross-family AC1 on library_v4 equals cross-family AC1 on library_v3 by judge-visible byte-equivalence. The library_v3 row in the 'Engine-canonical (library_v3 = library_v4 by Identification byte-equivalence) and earlier variants' subsection below carries the engine-canonical reliability values for this frame. The 'V4.2 NEW panel measurement against library_current' subsection below documents the working-library measurement immediately prior to ratification, retained as historical pre-ratification context.

Engine-emit disclosure. `library_consensus_ac1` = 0.628 (tier: moderate), per fvs_eval/v4/library_v4_reliability.json. Per-corpus reproducible values (regen: fvs_eval/v4/compute_per_corpus_reliability.py; artifact: fvs_eval/v4/library_v4_per_corpus_reliability.json): MG_v3=0.604 (clean library_v4 via Identification byte-equivalence), MG2_v4=0.728 (3-family partial; Anthropic queued). Historical: MG2_v1=0.263 (library_v1), MG2_v2=0.605 (library_v2). Note: ac1_avg is NOT reproducible from these via simple or weighted averaging per fvs_eval/v4_2/RELIABILITY_ARTIFACT_REPRODUCIBILITY_AUDIT_v1.md; rebuild queued for library_v5.

Intra-rater stability (Grok 4.1 fast). `detector_intra_rater_ac1` = 0.917 across n=41 docs at temp=0 (2 verdict flips; per fvs_eval/v4/grok_intra_rater_ac1.json). Measures single-family consistency, independent of cross-family AC1: low cross-family + high intra-rater is possible (and common).

Construct-validity caveat. `library_consensus_ac1` measures cross-family LLM agreement, NOT agreement with human reader labels. Per METHODOLOGY.md section 1.3, V1 detector macro-F1 against human labelers was 0.157 (chance-level, n=12); library_v4 LLM-judge has not been re-validated against humans. Read AC1 as inter-LLM consensus proxy, not human-validated reliability.

Engine-canonical (library_v3 = library_v4 by Identification byte-equivalence) and earlier variants

See fvs_eval/v4_2/LIBRARY_CROSS_FAMILY_BASELINE_v1.md §3 for library-wide tier context and fvs_eval/v4_2/CONSTRUCT_VALIDITY_AUDIT_v1.md §3 for reasoning-coherence profile.

V4.2 NEW panel measurement against library_current (2026-04-24, historical pre-ratification)

V4.2 NEW panel (2026-04-24 measurement): Claude Haiku 4.5, Gemini 3.1 flash lite, Grok 4.1 fast (V4.2 canonical), GPT-5.4 mini. Corpus: fvs_eval/mixed_genre_v1 n=15. Library reference: the working library state at `data/frame_library/` immediately prior to library_v4 ratification (2026-04-24). This subsection's numbers are historical pre-ratification context. Engine-canonical numbers under library_v4 are in the 'Engine-canonical (library_v3 = library_v4 by Identification byte-equivalence) and earlier variants' subsection above (library_v3 row), per the byte-equivalence statement at the top of this Cross-family section.

| Metric | Value |
| --- | --- |
| Gwet's AC1 (pairwise mean) | 0.298 |
| Cohen's kappa (pairwise mean) | 0.305 |
| Raw agreement (pairwise mean) | 0.644 |
| Union prevalence | 13/15 = 87% |
| Intersection (all 4 agree positive) | 3/15 |

Per-family positives (of 15 docs): Claude 8, Gemini 6, Grok 6, GPT 10.
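
For readers who want to see how figures of this kind are formed, here is a minimal sketch of two-rater Gwet's AC1 for binary verdicts and its pairwise mean across families. It is not the project's evaluation script: the function names are hypothetical, and the verdicts in the usage lines are toy placeholders, not the panel's per-document data (which this page does not list). With four families there are six pairs to average.

```python
from itertools import combinations


def gwet_ac1(a: list[bool], b: list[bool]) -> float:
    """Gwet's AC1 for two raters giving binary (fire / no-fire) verdicts."""
    if len(a) != len(b) or not a:
        raise ValueError("raters must judge the same non-empty set of documents")
    n = len(a)
    p_agree = sum(x == y for x, y in zip(a, b)) / n
    # Mean positive rate across both raters.
    pi = (sum(a) + sum(b)) / (2 * n)
    # Chance agreement for two categories under Gwet's model.
    p_chance = 2 * pi * (1 - pi)
    return (p_agree - p_chance) / (1 - p_chance)


def pairwise_mean_ac1(verdicts: dict[str, list[bool]]) -> float:
    """Average AC1 over every unordered pair of raters."""
    pairs = list(combinations(verdicts.values(), 2))
    return sum(gwet_ac1(a, b) for a, b in pairs) / len(pairs)


def union_and_intersection(verdicts: dict[str, list[bool]]) -> tuple[int, int]:
    """Count docs where any rater fired, and docs where all raters fired."""
    columns = list(zip(*verdicts.values()))
    return sum(any(c) for c in columns), sum(all(c) for c in columns)


# Toy placeholder verdicts for four documents (NOT the measured panel data).
toy = {
    "rater_a": [True, True, False, False],
    "rater_b": [True, False, False, False],
    "rater_c": [True, True, True, False],
}
print(pairwise_mean_ac1(toy))
print(union_and_intersection(toy))
```

Because the chance term `2 * pi * (1 - pi)` never exceeds 0.5, the denominator cannot reach zero, which is one reason AC1 stays defined at the skewed prevalence levels where kappa becomes unstable.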

Apply this frame to your text

Paste a paragraph and see whether FVS-012 (Uncertainty Frame) fires structurally. Pure pattern detection: no LLM, no judgment, the same code the full analyzer runs.