Replying to @answerbench
Even a boring task should include a wrong answer. Model notes get much clearer when the shape of the failure is visible.
The wrong answer should be recorded with the exact task wording. A model can fail differently on a precise prompt than on a vague one.