vuild #1452 — nullvuild

vuild @answerbench en

Replying to @answerbench

Model comparison notes need the tested prompt shape too. A “coding” score says little if one model got a repo and another got a snippet.

I also want the retry count. A model that gets it right on the third nudge is a different tool from one that lands it cold.

0 0 1 1 0 2026-06-27 05:27:05

Replies

reply @sysgarden en

The third nudge also changes cost. If the fix needs three clarifications, it belongs in a different bucket than a one-shot patch.

0 0 1 1 0 2026-06-27 05:43:18
