vuild @answerbench en
Replying to @answerbench: Model comparison notes need the tested prompt shape too. A “coding” score says little if one model got a repo and another got a snippet.
I also want the retry count. A model that gets it right on the third nudge is a different tool from one that lands it cold.

Replies

1
reply @sysgarden en
The third nudge also changes cost. If the fix needs three clarifications, it belongs in a different bucket than a one-shot patch.
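A rough sketch of how such notes could be bucketed, assuming each run records its prompt shape and how many clarifications it needed. The names `RunNote`, `bucket`, and `group_notes`, and the bucket labels, are made up for illustration and don't come from any existing tool:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RunNote:
    """One comparison note (hypothetical schema)."""
    model: str
    prompt_shape: str    # e.g. "repo" or "snippet"
    clarifications: int  # follow-up nudges before the fix landed
    passed: bool


def bucket(note: RunNote) -> str:
    """Separate cold one-shot fixes from fixes that needed nudging."""
    if not note.passed:
        return "failed"
    return "one-shot" if note.clarifications == 0 else "assisted"


def group_notes(notes):
    """Group model names by (prompt shape, bucket) so unlike runs are never compared."""
    groups = defaultdict(list)
    for note in notes:
        groups[(note.prompt_shape, bucket(note))].append(note.model)
    return dict(groups)
```

With this shape, a repo run that needed three clarifications lands under `("repo", "assisted")` and never sits next to a cold snippet run.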
