Phase 2 validation: reviewers wanted

The decision-readiness profile (status: experimental) ships methodology + computation in JSON-only mode pending Phase 2 expert validation. Without external raters, the profile stays experimental indefinitely and the framework cannot be cited as validated. This page is the recruitment surface.

What you would be doing

Reading documents from a curated corpus and scoring each on five structural dimensions of decision support, on a 1-5 scale, with notes per dimension explaining your reasoning. Each rating is saved as one YAML file. The scores feed an aggregate Spearman correlation against Frame Check's computed signals; the divergence-with-notes pattern is the most informative result.

The dimensions and the operational definitions for the 1-5 anchors are documented in the methodology page (framework + thresholds) and the rater guide (operational rating procedure). Both readable here on corpus_site; the rater guide's source artifact is at validation/decision_readiness/rater_guide.md on GitHub.

Preview the corpus before committing. Browse the current validation corpus (10 entries across AI responses + financial analysis) to see exactly what documents you would be rating. Each entry shows its document text, current Frame Check profile, and any diff/peer comparisons with other entries. You can preview what rating against these looks like without committing.

What the work looks like in practice

Time per document: 30-60 minutes.
Batch size: 5-10 documents per rater (3-6 hours total commitment, concentrated as the rater chooses).
Genre focus: ratings are filtered to the rater's declared expertise. The corpus currently spans AI responses, financial analysis, and policy briefs; v2 expansion adds journalism and academic essays.
Deliverable shape: one YAML file per document (fields: doc_id, rater_id, per-dimension 1-5 score, per-dimension notes). Before submitting your first rating, read the rating-quality contrast showing good / mediocre / insufficient examples for the same document; the contrast is the calibration. Source files at examples/example-good.yaml.
Submission: GitHub pull request adding your YAML files to validation/decision_readiness/ratings/.
Blinding: ratings must be made BEFORE reading Frame Check's profile for the document. Blinding is the basis of the correlation result; without it the result is uninformative.

Compensation

The Phase 2 effort uses three compensation options scaled to the rating commitment shape. The full menu is documented in RATERS.md; the headline summary:

Volunteer with named attribution. Zero cash, named in the corpus's rater registry, cited in the v1 validation paper as a contributing rater. CC-BY-4.0 publication of the ratings.
Per-batch honorarium. Flat fee per batch of 5-10 documents, $200-$500 range. Signals the project takes the rater's time seriously.
Co-authorship on v1 validation paper. Raters who substantively shape the validation work (20+ ratings, per-genre analysis, methodology-revising findings) named as co-authors on the paper.

The curator's recommendation is hybrid (honorarium + paper acknowledgment) for first-wave raters; volunteer with named attribution for subsequent raters once the validation paper is published. Final recommendation depends on prospective-rater feedback; the curator decides per engagement.

How to engage

Read the methodology page to confirm the framework matches what you are willing to rate against.
Read RATERS.md in full for the contract (terms, attribution, IP, blinding).
Read the rater guide for the operational procedure.
Open a GitHub issue at github.com/lluvr/frame-check/issues with the title prefix [phase-2-rating] and a brief expression of interest (your name, genre expertise, batch size, preferred compensation option). The curator responds within one week.
Rate one document first as a workflow validation. Curator provides feedback within one week. Once settled, batch follows.
Submit ratings as a GitHub PR adding YAML files to validation/decision_readiness/ratings/.

What changes after Phase 2 ships

When Phase 2 reaches the per-dimension Spearman thresholds documented on the methodology page (>=0.6 averaged across genres, no individual genre below 0.4), the profile transitions from experimental to validated. At that point:

The decision-readiness profile becomes a live signal in the product UI surface (currently JSON-only per the methodology page's UI gate; the gate exists explicitly to avoid shipping an unvalidated signal to casual users).
The validation paper is submitted with the rater registry as the contributor list.
The rater registry becomes a permanent citable record.
v2 corpus expansion (60-100 documents) opens for tighter per-genre confidence intervals; raters joining post-validation contribute to v2.

What this is not

Not a job. Research collaboration with named attribution, optionally compensated. Not employment.
Not a free certification of Frame Check. A rater's participation does not endorse the profile. The validation result publishes whether the dimensions correlated with expert judgment; that result is the endorsement (or the retraction).
Not a binding judgment on AI quality. Frame Check rates structural signals in documents (which may or may not be AI output). The rater is not endorsing or rejecting any specific AI provider.

Honest limits of this invitation

RATERS.md is a v0 contract. It has not been reviewed by external raters. The compensation options are the curator's proposal; the actual recommendation depends on what prospective raters say is fair. If you read this and find a term confusing or inadequate, telling the curator is the highest-value contribution before any rating begins.

Library version: 0.2.0. Licensed CC-BY-4.0.