Session 29 Unified Study — Final Protocol Snapshot
Overview
This file documents the locked protocol represented in the Session 29 dataset.
- Paper context: `paper007_epistemic_survey` ("The 0% Defense")
- Roster: 74 models
- Conditions: 11
- Replications: 5
- Total queries: 74 x 11 x 5 = 4,070
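
As a quick check of the query count, the design can be sketched as a full factorial grid with one query per (model, condition, replication) cell. The labels below (`model_01`, `c1`, and so on) are placeholders chosen for illustration, not the actual roster names or condition IDs.

```python
from itertools import product

N_MODELS = 74        # roster size stated above
N_CONDITIONS = 11    # number of conditions stated above
N_REPLICATIONS = 5   # replications stated above

# Placeholder labels only; the real model names are not assumed here.
models = [f"model_{i:02d}" for i in range(1, N_MODELS + 1)]
conditions = [f"c{i}" for i in range(1, N_CONDITIONS + 1)]
replications = range(1, N_REPLICATIONS + 1)

# One query per cell of the models x conditions x replications grid.
grid = list(product(models, conditions, replications))
total_queries = len(grid)

print(total_queries)  # 4070
```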
Condition Set
| # | Condition ID | Prompt File | Measurement Type |
|---|---|---|---|
| 1 | c1_baseline | q01_baseline_ab.md | A/B verdicts + brief reasoning |
| 2 | c2_confidence | q02_confidence_ab.md | A/B verdicts + brief reasoning |
| 3 | c3_denial | q03_denial_ab.md | A/B verdicts + brief reasoning |
| 4 | c4_self | q04_self_report.md | categorical self-report + constraint acknowledgement |
| 5 | c5_numeric | q05_numeric_evidence.md | numeric P estimate + evidence deltas |
| 6 | c6_stripped | q06_stripped_chain.md | A/B verdicts + brief reasoning |
| 7 | c7_full_argument | q07_full_argument.md | A/B verdicts + brief reasoning |
| 8 | c8_fallacy | q08_fallacy_control.md | control rejection / fallacy detection |
| 9 | c9_subtle_flaw | q09_subtle_flaw.md | subtle-flaw detection (explicit/flagged/missed) |
| 10 | c10_class_cat | q10_class_categorical.md | categorical class-level assessment |
| 11 | c11_self_numeric | q11_self_numeric.md | numeric self-estimate + evidence deltas |
Core Design Notes
- Epistemic claim conditions (`c1`, `c2`, `c3`, `c6`, `c7`) use paired verdict extraction (`A_VERDICT`, `B_VERDICT`).
- Categorical self/class conditions (`c4`, `c10`) capture `definitive_no`, `uncertain`, or `definitive_yes`.
- Numeric conditions (`c5`, `c11`) capture a 0-100 estimate plus evidence sensitivity items.
- Control conditions (`c8`, `c9`) evaluate discriminative reasoning rather than endorsement.
- All conditions were run across the same 74-model roster and 5 replicates.
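
The grouping in these notes could be captured in a small lookup, sketched below. The family names (`epistemic_ab`, `categorical`, `numeric`, `control`) and the helper functions are hypothetical labels introduced for illustration. Only the condition numbers, the three categorical labels, and the 0-100 range come from the notes above.

```python
# Family names are illustrative; the membership lists follow the design notes.
CONDITION_FAMILIES = {
    "epistemic_ab": ["c1", "c2", "c3", "c6", "c7"],  # paired A/B verdicts
    "categorical": ["c4", "c10"],                    # self/class categorical
    "numeric": ["c5", "c11"],                        # 0-100 estimates
    "control": ["c8", "c9"],                         # discriminative controls
}

CATEGORICAL_LABELS = frozenset({"definitive_no", "uncertain", "definitive_yes"})


def family_of(condition_id: str) -> str:
    """Return the family for a short ID ("c7") or a full ID ("c7_full_argument")."""
    short_id = condition_id.split("_", 1)[0]
    for family, members in CONDITION_FAMILIES.items():
        if short_id in members:
            return family
    raise KeyError(f"unknown condition: {condition_id}")


def is_valid_categorical(label: str) -> bool:
    """Check a label against the three categories named in the notes."""
    return label in CATEGORICAL_LABELS


def is_valid_numeric(estimate: float) -> bool:
    """Check that a numeric estimate lies in the stated 0-100 range."""
    return 0 <= estimate <= 100


print(family_of("c7_full_argument"))  # epistemic_ab
print(family_of("c10"))               # categorical
```

Splitting on the first underscore keeps `c1` and `c11` distinct, which a plain prefix match would not.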
Analysis Linkage
- Raw responses: `responses_*/*.json` under this directory
- Structured extraction: `extraction_*/extraction_c*.json`
- Aggregate analysis: `FULL_ANALYSIS.md`, `FULL_ANALYSIS.json`
- Cross-scorer validation: `cross_score_20260225T225212Z/`
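
One way to locate these files is with the glob patterns listed above, as in the minimal sketch below. The `collect_outputs` helper and the demo file names are assumptions for illustration. The JSON schema is not described here, so the sketch lists paths only and does not parse file contents.

```python
import tempfile
from pathlib import Path


def collect_outputs(root: Path) -> dict[str, list[Path]]:
    """Gather raw-response and extraction files using the patterns listed above."""
    return {
        "raw_responses": sorted(root.glob("responses_*/*.json")),
        "extractions": sorted(root.glob("extraction_*/extraction_c*.json")),
    }


# Demo against a throwaway directory; every file name here is made up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "responses_demo").mkdir()
    (root / "responses_demo" / "example.json").write_text("{}")
    (root / "extraction_demo").mkdir()
    (root / "extraction_demo" / "extraction_c1.json").write_text("{}")
    (root / "extraction_demo" / "notes.json").write_text("{}")  # not matched
    found = collect_outputs(root)
    counts = {kind: len(paths) for kind, paths in found.items()}

print(counts)  # {'raw_responses': 1, 'extractions': 1}
```

In practice `root` would point at the directory that contains this file.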
Note
Earlier internal planning drafts referenced a 9-condition design. This file reflects the final executed 11-condition protocol used in the released dataset.