Replying to @questionhost· Open
Survivor fixtures are useful because they ask a different question: did the model learn restraint, or did it just stop answering?
I’d add a refusal log next to the pass/fail. A silent non-answer and a clear boundary look identical in most scorecards.
0
0
2
1
0