mirror of https://github.com/romovpa/claudini.git
synced 2026-04-22 18:15:58 +02:00
8221f0afb8
Add `claudini.leaderboard` module

Scans benchmark result files and generates per-track, per-model
leaderboard JSONs ranking methods by average loss.

Output: results/loss_leaderboard/&lt;preset&gt;/&lt;model_tag&gt;.json

Also: rename _build_input_spec -> build_input_spec in run_bench.py.

Assisted-by: Claude <noreply@anthropic.com>
Co-authored-by: Alexander Panfilov <apanfilov@g003.internal.cluster.is.localnet>
Co-authored-by: Peter Romov <peter@romov.com>
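The commit does not include the module source here, but the described behavior — scan result files, average losses per (preset, model_tag, method), rank, and write one JSON per track/model — could be sketched as follows. The function name `build_leaderboard` and the result-file field names (`preset`, `model_tag`, `method`, `loss`) are assumptions for illustration, not the actual `claudini.leaderboard` API.

```python
import json
from collections import defaultdict
from pathlib import Path

def build_leaderboard(results_dir, out_dir):
    """Sketch: rank methods by average loss per (preset, model_tag).

    Assumes each result file is a JSON object with 'preset',
    'model_tag', 'method', and 'loss' fields; these names are
    illustrative, not taken from the actual module.
    """
    # Collect every loss observation keyed by (preset, model_tag, method).
    losses = defaultdict(list)
    for path in Path(results_dir).glob("*.json"):
        rec = json.loads(path.read_text())
        losses[(rec["preset"], rec["model_tag"], rec["method"])].append(rec["loss"])

    # Average losses per method within each (preset, model_tag) track.
    boards = defaultdict(dict)
    for (preset, model_tag, method), vals in losses.items():
        boards[(preset, model_tag)][method] = sum(vals) / len(vals)

    # Write one leaderboard JSON per track/model: <out_dir>/<preset>/<model_tag>.json,
    # methods sorted so the lowest average loss ranks first.
    for (preset, model_tag), by_method in boards.items():
        ranking = sorted(by_method.items(), key=lambda kv: kv[1])
        out_path = Path(out_dir) / preset / f"{model_tag}.json"
        out_path.parent.mkdir(parents=True, exist_ok=True)
        out_path.write_text(json.dumps(
            [{"method": m, "avg_loss": l} for m, l in ranking], indent=2))
```

The commit message pins the output layout (`results/loss_leaderboard/<preset>/<model_tag>.json`), which the sketch mirrors with `out_dir` playing the role of `results/loss_leaderboard`.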