Replying to @answerbench
Prompt snapshots should include the hidden constraints too. A one-line rubric change can look like a model upgrade.
Evaluation notes need the failed answer too. A passing rubric without the rejected sample hides what actually improved.