This module is being built
Module 1 is the fully-developed reference. This page captures everything needed to expand Module 05 next: its scenario, learning objectives, and source material.
Planned learning objectives
- Audit an eval for task-design, harness, metrics-hygiene, and grader-bias gotchas using a Claude Code skill.
- Wrap any eval in a model × thinking × effort grid.
- Instrument per-cell pass rate, cost, and latency.
- Produce comparison plots and a one-sentence model recommendation.
Source material
This module adapts cwc-workshops/rightmodel (Apache-2.0). The build will follow the standard module template: scenario, objectives, setup, walkthrough with checkpoints, assessment, and stretch goals — the same shape as Module 1.