Phase 2 validation: reviewers wanted
The decision-readiness profile (status:
experimental) ships its methodology and computation
in JSON-only mode pending Phase 2 expert validation. Without
external raters, the profile stays experimental indefinitely
and the framework cannot be cited as validated. This page is
the recruitment surface.
What you would be doing
Reading documents from a curated corpus and scoring each on five
structural dimensions of decision support, on a 1-5 scale, with
notes per dimension explaining your reasoning. Each rating is
saved as one YAML file. The scores feed an aggregate Spearman
correlation against Frame Check's computed signals. Cases where a
rater's score diverges from the computed signal, together with the
rater's notes explaining why, are the most informative result.
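As a rough illustration of the correlation step described above, here is a minimal sketch of a Spearman calculation for one dimension. It is not Frame Check's own code: the function names, the score lists, and the signal values are all made up for the example, and the project's actual aggregation may differ.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one side is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: one dimension, five documents.
rater_scores = [4, 2, 5, 3, 1]                    # a rater's 1-5 scores
computed_signal = [0.71, 0.35, 0.90, 0.40, 0.52]  # made-up computed values
rho = spearman(rater_scores, computed_signal)
print(round(rho, 3))  # 0.7 for this made-up data
```

In this made-up data the fifth document drives most of the disagreement (the rater ranks it lowest, the signal ranks it in the middle), which is the kind of case where the per-dimension notes matter most.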
The dimensions and the operational definitions for the 1-5 anchors
are documented in the
methodology page
(framework + thresholds) and the
rater guide
(operational rating procedure). Both are readable here on corpus_site;
the rater guide's source artifact is at
validation/decision_readiness/rater_guide.md
on GitHub.
Preview the corpus before committing. Browse
the current
validation corpus
(10 entries across AI responses + financial analysis) to see
exactly what documents you would be rating. Each entry shows its
document text, current Frame Check profile, and any diff/peer
comparisons with other entries. You can preview what rating
against these looks like without committing.
What the work looks like in practice
- Time per document: 30-60 minutes.
- Batch size: 5-10 documents per rater (2.5-10 hours
total at the per-document time above, concentrated as the rater chooses).
- Genre focus: ratings are filtered to the rater's
declared expertise. The corpus currently spans AI responses,
financial analysis, and policy briefs; v2 expansion adds journalism
and academic essays.
- Deliverable shape: one YAML file per document
(fields: doc_id, rater_id, per-dimension 1-5 score, per-dimension
notes). Before submitting your first rating, read the
rating-quality
contrast showing good / mediocre / insufficient examples for the
same document; the contrast is the calibration. Source files at
examples/example-good.yaml.
- Submission: GitHub pull request adding your
YAML files to
validation/decision_readiness/ratings/.
- Blinding: ratings must be made BEFORE reading
Frame Check's profile for the document. Blinding is the basis of
the correlation result; without it the result is uninformative.
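To make the deliverable shape above concrete, here is a minimal sketch of a pre-submission check on one rating record, assuming it has already been loaded from YAML into a Python dict. The nested `scores` and `notes` keys, the placeholder dimension names, and the example IDs are assumptions for illustration only; the real field layout is whatever the rater guide and the example files specify.

```python
# Placeholder names: the real five dimensions are defined on the methodology page.
DIMENSIONS = (
    "dimension_1",
    "dimension_2",
    "dimension_3",
    "dimension_4",
    "dimension_5",
)


def validate_rating(rating):
    """Return a list of problems; an empty list means the record looks complete."""
    problems = []
    for field in ("doc_id", "rater_id"):
        value = rating.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field}: missing or empty")
    scores = rating.get("scores", {})
    notes = rating.get("notes", {})
    for dim in DIMENSIONS:
        score = scores.get(dim)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 5:
            problems.append(f"scores.{dim}: expected an integer from 1 to 5")
        note = notes.get(dim)
        if not isinstance(note, str) or not note.strip():
            problems.append(f"notes.{dim}: a note explaining the score is required")
    return problems


# Hypothetical record; IDs and note text are invented.
good = {
    "doc_id": "example-doc",
    "rater_id": "example-rater",
    "scores": {dim: 3 for dim in DIMENSIONS},
    "notes": {dim: "Reasoning for this score." for dim in DIMENSIONS},
}
good_problems = validate_rating(good)

# Same record with an out-of-range score and a blank note.
bad = {
    **good,
    "scores": {**good["scores"], "dimension_2": 6},
    "notes": {**good["notes"], "dimension_4": "  "},
}
bad_problems = validate_rating(bad)
```

A check like this only catches missing or malformed fields. It says nothing about rating quality, which is what the good / mediocre / insufficient contrast is for.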
Compensation
The Phase 2 effort offers three compensation options scaled to the
size of the rating commitment. The full menu is documented in
RATERS.md;
the headline summary:
- Volunteer with named attribution. Zero cash,
named in the corpus's rater registry, cited in the v1 validation
paper as a contributing rater. CC-BY-4.0 publication of the
ratings.
- Per-batch honorarium. Flat fee per batch of
5-10 documents, $200-$500 range. Signals the project takes the
rater's time seriously.
- Co-authorship on v1 validation paper. Raters
who substantively shape the validation work (20+ ratings, per-genre
analysis, methodology-revising findings) named as co-authors on
the paper.
The curator's recommendation is hybrid (honorarium + paper
acknowledgment) for first-wave raters; volunteer with named
attribution for subsequent raters once the validation paper is
published. Final recommendation depends on prospective-rater
feedback; the curator decides per engagement.
How to engage
- Read the
methodology page
to confirm the framework matches what you are willing to rate
against.
- Read
RATERS.md
in full for the contract (terms, attribution, IP, blinding).
- Read the
rater guide
for the operational procedure.
- Open a GitHub issue at
github.com/lluvr/frame-check/issues
with the title prefix
[phase-2-rating] and a brief
expression of interest (your name, genre expertise, batch size,
preferred compensation option). The curator responds within one
week.
- Rate one document first to validate the workflow. The curator
provides feedback within one week. Once any issues are settled, the
rest of the batch follows.
- Submit ratings as a GitHub PR adding YAML files to
validation/decision_readiness/ratings/.
What changes after Phase 2 ships
When Phase 2 reaches the per-dimension Spearman thresholds
documented on the methodology page (>=0.6 averaged across genres,
no individual genre below 0.4), the profile transitions from
experimental to validated. At that point:
- The decision-readiness profile becomes a live signal in the
product UI surface (currently JSON-only per the methodology page's
UI gate; the gate exists explicitly to avoid shipping an
unvalidated signal to casual users).
- The validation paper is submitted with the rater registry as
the contributor list.
- The rater registry becomes a permanent citable record.
- v2 corpus expansion (60-100 documents) opens for tighter
per-genre confidence intervals; raters joining post-validation
contribute to v2.
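The transition criterion stated at the top of this section (Spearman >=0.6 averaged across genres, no individual genre below 0.4) can be sketched as a small check. This is an illustrative reading, not the project's implementation: it assumes every dimension must pass for the profile to move to validated, and the function names, genre keys, and correlation values are hypothetical.

```python
from statistics import mean

MEAN_FLOOR = 0.6   # minimum Spearman averaged across genres
GENRE_FLOOR = 0.4  # minimum Spearman for any single genre


def dimension_passes(rho_by_genre):
    """True if one dimension's per-genre correlations meet both thresholds."""
    values = list(rho_by_genre.values())
    if not values:
        raise ValueError("no genre correlations supplied")
    return mean(values) >= MEAN_FLOOR and min(values) >= GENRE_FLOOR


def profile_status(rho_by_dimension):
    """Assumption: the profile is validated only if every dimension passes."""
    if not rho_by_dimension:
        raise ValueError("no dimensions supplied")
    if all(dimension_passes(genres) for genres in rho_by_dimension.values()):
        return "validated"
    return "experimental"


# Hypothetical correlations for two dimensions across two genres.
passing = {
    "dimension_1": {"ai_responses": 0.72, "financial_analysis": 0.55},
    "dimension_2": {"ai_responses": 0.81, "financial_analysis": 0.66},
}
# The average clears 0.6, but one genre falls below the 0.4 floor.
failing = {
    "dimension_1": {"ai_responses": 0.90, "financial_analysis": 0.38},
    "dimension_2": {"ai_responses": 0.81, "financial_analysis": 0.66},
}
status_passing = profile_status(passing)
status_failing = profile_status(failing)
```

The second example shows why the per-genre floor exists: a strong result in one genre cannot mask a weak result in another.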
What this is not
- Not a job. Research collaboration with named
attribution, optionally compensated. Not employment.
- Not a free certification of Frame Check. A
rater's participation does not endorse the profile. The
validation result publishes whether the dimensions correlated
with expert judgment; that result is the endorsement (or the
retraction).
- Not a binding judgment on AI quality. Frame
Check rates structural signals in documents (which may or may
not be AI output). The rater is not endorsing or rejecting any
specific AI provider.
Honest limits of this invitation
RATERS.md is a v0 contract. It has not been reviewed by external
raters. The compensation options are the curator's proposal; the
actual recommendation depends on what prospective raters say is
fair. If you read this and find a term confusing or inadequate,
telling the curator is the highest-value contribution you can make
before any rating begins.
Library version: 0.2.0. Licensed CC-BY-4.0.