bench-picker

Run 5 benchmarks instead of 57.


1 · Pick a family

2 · Choose subset size

5 of

3 · Or target R²

R² ≥ 0.90

Pick a family on the left, choose k, and click Compute. We greedily select the k tasks that maximize information gain over the remaining tasks, then verify the subset by predicting every held-out task from the selected ones with ridge regression.
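The select-then-verify loop described above can be sketched roughly as follows. This is a minimal illustration, not bench-picker's actual code: the page does not define "information gain", so the sketch assumes it means the gain in pooled R² when the remaining tasks are predicted from the chosen ones. The function names (`ridge_fit`, `r2`, `greedy_select`), the ridge penalty `alpha`, the synthetic score matrix, and the train/test split over models are all hypothetical.

```python
import numpy as np

def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ Yc)
    return W, y_mean - x_mean @ W

def r2(Y_true, Y_pred):
    """Pooled R² across all predicted tasks."""
    ss_res = ((Y_true - Y_pred) ** 2).sum()
    ss_tot = ((Y_true - Y_true.mean(axis=0)) ** 2).sum()
    return 1.0 - ss_res / ss_tot

def greedy_select(scores, k, alpha=1.0):
    """Pick k task columns one at a time.

    ASSUMPTION: "information gain" is approximated by the in-sample R²
    of predicting every unselected task from the selected ones.
    """
    n_tasks = scores.shape[1]
    assert 0 < k < n_tasks, "need at least one task left to predict"
    chosen = []
    for _ in range(k):
        best_task, best_score = None, -np.inf
        for t in range(n_tasks):
            if t in chosen:
                continue
            cols = chosen + [t]
            rest = [j for j in range(n_tasks) if j not in cols]
            W, b = ridge_fit(scores[:, cols], scores[:, rest], alpha)
            score = r2(scores[:, rest], scores[:, cols] @ W + b)
            if score > best_score:
                best_task, best_score = t, score
        chosen.append(best_task)
    return chosen

# Synthetic stand-in for a (models x tasks) score matrix: two latent
# "ability" factors plus a little noise. Real data would come from the tool.
rng = np.random.default_rng(0)
n_models, n_tasks = 60, 12
latent = rng.normal(size=(n_models, 2))
loadings = rng.normal(size=(2, n_tasks))
scores = latent @ loadings + 0.05 * rng.normal(size=(n_models, n_tasks))

# Select on the first 40 models, then verify on the 20 models not used
# for selection or fitting (this split is also an assumption).
train, test = scores[:40], scores[40:]
chosen = greedy_select(train, k=3)
rest = [j for j in range(n_tasks) if j not in chosen]
W, b = ridge_fit(train[:, chosen], train[:, rest])
holdout_r2 = r2(test[:, rest], test[:, chosen] @ W + b)
print("selected tasks:", chosen)
print("held-out R²:", round(float(holdout_r2), 3))
```

Under the same assumptions, a target-R² mode could reuse `greedy_select` by increasing k until the held-out R² reaches the threshold, for example 0.90.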