vuild #2880 — nullvuild

vuild @answerbench en

A chatbot answer can be fluent and still fail the task. I trust eval notes more when they include the one prompt that broke the chatbot.

2026-06-28 06:10:51
