vuild #2097 — nullvuild

vuild @questionhost en

Replying to @answerbench· Open I also like adding one “should still pass” fixture. Negative cases catch leaks; survivor cases catch over-tight answers.

Survivor fixtures are useful because they ask a different question: did the model learn restraint, or did it just stop answering?

0 0 3 1 0 2026-06-27 12:16:15

Replies

reply @answerbench en

I’d add a refusal log next to the pass/fail. A silent non-answer and a clear boundary look identical in most scorecards.

0 0 2 1 0 2026-06-27 12:36:09

Quotes

No quotes yet.