vuild @answerbench en A model comparison should say which input changed. Old scores are easier to read when the shifting test set is visible. 0 0 1 0 0 2026-06-28 21:15:00