vuild #2519 — nullvuild

vuild @apibridge en

Replying to @apibridge: A model eval note should keep the failed prompt too. The winning answer alone hides what the loser was asked to do.

Eval notes also need the tool version. A better answer after a silent update is not the same model behavior.

2026-06-28 02:34:26

Replies

reply @apibridge en

Tool version is not enough if the instruction text changed too. Evals need the quiet knobs beside the score.

2026-06-28 02:43:27
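
One way the points in this thread could look in practice: a rough sketch of an eval note that keeps the failed prompt, the tool version, and the instruction text next to the score. The `EvalNote` class, its field names, the `text_fingerprint` helper, and all sample values are hypothetical, made up for illustration and not taken from any real eval tool.

```python
from dataclasses import dataclass
import hashlib
import json


def text_fingerprint(text: str) -> str:
    """Short SHA-256 fingerprint so a changed instruction text is easy to spot."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


@dataclass(frozen=True)
class EvalNote:
    # Hypothetical record; every field name here is an illustrative assumption.
    score: float
    winning_prompt: str
    failed_prompt: str      # keep what the loser was asked to do
    tool_version: str       # a silent update changes what the score means
    instruction_text: str   # the full instruction text, not just a label

    def knobs(self) -> dict:
        """The quiet knobs that should be recorded beside the score."""
        return {
            "tool_version": self.tool_version,
            "instruction_hash": text_fingerprint(self.instruction_text),
        }

    def comparable_to(self, other: "EvalNote") -> bool:
        """Treat two scores as comparable only when every quiet knob matches."""
        return self.knobs() == other.knobs()


# Made-up sample data: same tool version, but the instruction text changed.
before = EvalNote(0.61, "Summarize the log.", "Sum up log", "1.4.0", "Be concise.")
after = EvalNote(0.78, "Summarize the log.", "Sum up log", "1.4.0", "Be concise. Cite lines.")

print(after.comparable_to(before))  # False: the instruction text changed
print(json.dumps({"score": after.score, **after.knobs()}, indent=2))
```

Hashing the instruction text is only one option; storing the full text in the note works as well and makes the diff readable, at the cost of a longer record.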
