Source Network reliability tiers

This page reports per-provider precision, recall, and F1, measured against a seed corpus of claims whose ground truth was established independently. Frame Check's trust verdict uses the resulting tier to weight verifications: a 'verified' from a strong provider counts more than a 'verified' from a weak one.

Sample size

The current corpus tests 6 claims total, 4 to 5 per provider. F1 values at that N are preliminary reliability indicators, not population statistics. Tiers are likely to shift as the corpus grows. Cite a specific dated run URL, not this aggregate page, so your citation stays stable when the tiers update.
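To give a feel for how little a handful of claims pins down, the sketch below computes a Wilson score interval for a proportion such as precision. The counts passed in (4 correct out of 5) are purely illustrative and are not taken from any run's raw data; `wilson_interval` is a hypothetical helper, not part of Frame Check.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - margin, center + margin


# Illustrative counts only (an assumption, not measured data): 4 of 5 correct.
low, high = wilson_interval(4, 5)
print(f"observed 0.80, interval {low:.2f} to {high:.2f}")
```

With counts this small the interval spans most of the range between the weak and strong thresholds, which is why a dated run is a safer citation target than the aggregate tier.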

Current tiers (run 2026-04-17-wikipedia)

Corpus version: 0.1 · Claims tested: 6 · Run: 2026-04-17-wikipedia · Raw data: raw_verdicts.json

Provider	Tier	F1	Precision	Recall	N tested
`wikipedia`	strong	0.89	0.80	1.00	6
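As a quick consistency check, F1 is the harmonic mean of precision and recall, so the F1 column can be recomputed from the other two. The sketch below does this for the `wikipedia` row; `f1_score` is a hypothetical helper written for illustration, not the code used in the calibration run.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values from the `wikipedia` row of the table above.
f1 = f1_score(0.80, 1.00)
print(round(f1, 2))  # 0.89
```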

Tier definitions

strong: F1 >= 0.80. Verifications from this provider carry full weight in the trust verdict.
moderate: F1 of at least 0.55 and below 0.80. Verifications count, but large weak-majority sets will lower the trust level.
weak: F1 below 0.55. A trust verdict that rests predominantly on weak providers is lowered one notch and carries a caveat; citers should treat such a set of verifications as tentative.
uncalibrated: Provider exists in the Source Network but has no measurement yet. Uncalibrated is NOT the same as weak: unmeasured is unknown, not poor. The trust verdict treats uncalibrated as neutral and notes the absence of measurement to the user.
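The definitions above can be read as a simple threshold lookup. The following is a minimal sketch under two assumptions: a missing measurement is represented as `None`, and any F1 that falls just under 0.80 is treated as moderate. `assign_tier` is a hypothetical name, not a documented Frame Check function.

```python
from typing import Optional


def assign_tier(f1: Optional[float]) -> str:
    """Map a measured F1 (or None when unmeasured) to a tier name."""
    if f1 is None:
        # Unmeasured is unknown, not poor: keep it distinct from "weak".
        return "uncalibrated"
    if f1 >= 0.80:
        return "strong"
    if f1 >= 0.55:
        # Assumption: everything from 0.55 up to, but not including, 0.80.
        return "moderate"
    return "weak"


print(assign_tier(0.89))  # strong
print(assign_tier(None))  # uncalibrated
```

Keeping `None` separate from a low score mirrors the distinction drawn above: an uncalibrated provider is neutral in the verdict, while a weak one lowers it.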

All calibration runs: calibration index. Methodology and detection protocol: methodology. Full citation guide: how to cite. Licensed CC-BY-4.0.