voice-bench

Live aggregator for speech & voice AI benchmark leaderboards — Open ASR (Word Error Rate), TTS-Arena v2 (Elo), VoiceBench (voice-instruction following), and HuggingFace audio trending models & Spaces — on one dark-themed, sortable, refreshed-hourly page.

Live: https://holyai.me/voice-bench/

What this is

The voice-agent stack went mainstream between October 2025 and May 2026: OpenAI Realtime hit GA, Cartesia shipped Sonic 3 (90 ms time-to-first-audio), ElevenLabs Conversational AI v2 dropped, and Vapi crossed 200M weekly minutes. Every team picking a stack asks the same question and finds twelve fragmented answers:

"Which speech model is actually state-of-the-art right now — for transcription, for synthesis, for following voice instructions?"

voice-bench is a single page that aggregates the public speech-AI benchmark leaderboards listed below, refreshes each one on its own schedule (hourly to every 12 hours), computes 7-day movers across all of them, and exposes JSON + RSS feeds so you can plug it into your own dashboards.
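As a rough illustration of what a "7-day mover" could mean, the sketch below compares each model's latest value with its most recent value on or before the day seven days earlier. The history row shape (`model`, `day`, `value`) and the function name `computeMovers` are assumptions made for this example, not the project's actual schema or code.

```javascript
// Hypothetical history rows: { model: string, day: 'YYYY-MM-DD', value: number }
function computeMovers(history, today, windowDays = 7) {
  // Cutoff day = today minus the window, as a YYYY-MM-DD string in UTC.
  const cutoff = new Date(Date.parse(`${today}T00:00:00Z`) - windowDays * 86400000)
    .toISOString()
    .slice(0, 10);

  // Group observations by model.
  const byModel = new Map();
  for (const row of history) {
    if (!byModel.has(row.model)) byModel.set(row.model, []);
    byModel.get(row.model).push(row);
  }

  const movers = [];
  for (const [model, rows] of byModel) {
    rows.sort((a, b) => a.day.localeCompare(b.day));
    const latest = rows[rows.length - 1];
    // Baseline: the most recent observation on or before the cutoff day.
    const baseline = rows.filter((r) => r.day <= cutoff).pop();
    // Skip models with no baseline; no value is invented for them.
    if (!baseline || baseline === latest) continue;
    movers.push({
      model,
      from: baseline.value,
      to: latest.value,
      delta: latest.value - baseline.value,
    });
  }
  // Largest absolute change first.
  return movers.sort((a, b) => Math.abs(b.delta) - Math.abs(a.delta));
}
```

Models that appear only inside the window are skipped rather than given a made-up baseline, in keeping with the no-mock-data rule.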

Live data sources (every datum has a URL)

| Source | URL | Refresh |
|---|---|---|
| Open ASR Leaderboard (CSV) | https://huggingface.co/datasets/hf-audio/asr_leaderboard/resolve/main/data/results.csv | every 6h |
| Open ASR Leaderboard (HTML fallback) | https://huggingface.co/spaces/hf-audio/open_asr_leaderboard | every 6h (fallback) |
| TTS Arena v2 (CSV) | https://huggingface.co/spaces/TTS-AGI/TTS-Arena-V2/resolve/main/leaderboard.csv | every 6h |
| TTS Arena v2 (HTML fallback) | https://huggingface.co/spaces/TTS-AGI/TTS-Arena-V2 | every 6h (fallback) |
| VoiceBench README | https://raw.githubusercontent.com/MatthewCYM/VoiceBench/main/README.md | every 12h |
| HF trending ASR models | https://huggingface.co/api/models?filter=automatic-speech-recognition&sort=trendingScore&direction=-1&limit=50 | every 1h |
| HF trending TTS models | https://huggingface.co/api/models?filter=text-to-speech&sort=trendingScore&direction=-1&limit=50 | every 1h |
| HF trending Audio Classification | https://huggingface.co/api/models?filter=audio-classification&sort=trendingScore&direction=-1&limit=30 | every 6h |
| HF trending Audio Spaces | https://huggingface.co/api/spaces?filter=audio&sort=trendingScore&direction=-1&limit=50 | every 1h |
| arXiv audio papers (cs.SD + eess.AS) | https://export.arxiv.org/api/query?search_query=cat:cs.SD+OR+cat:eess.AS&sortBy=submittedDate&sortOrder=descending&max_results=30 | every 6h |
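The four Hugging Face trending URLs in the table share one query pattern, so they can be built from a small helper. This is a minimal sketch, assuming Node.js 18+ for the global `fetch`; the names `trendingUrl` and `fetchTrending` are illustrative and not taken from the project's code.

```javascript
const HF_API = 'https://huggingface.co/api';

// Build a trending-feed URL. `kind` is 'models' or 'spaces';
// `filter` is the tag used in the table above (for example 'text-to-speech').
function trendingUrl(kind, filter, limit) {
  const params = new URLSearchParams({
    filter,
    sort: 'trendingScore',
    direction: '-1',
    limit: String(limit),
  });
  return `${HF_API}/${kind}?${params}`;
}

// Fetch one feed and return the parsed JSON body as-is.
// Throws on a non-2xx response so the caller can log the failure.
async function fetchTrending(kind, filter, limit) {
  const res = await fetch(trendingUrl(kind, filter, limit));
  if (!res.ok) throw new Error(`HF trending ${kind}/${filter}: HTTP ${res.status}`);
  return res.json();
}
```

For example, `trendingUrl('models', 'automatic-speech-recognition', 50)` reproduces the "HF trending ASR models" URL from the table.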

No mock data

Every number rendered on this page came from a real runtime fetch against a real public URL. No seed arrays, no Math.random() jitter, no "realistic-looking" fallbacks. If a source fails, the row stays empty and the failure is logged at /voice-bench/api/fetch-errors.
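One way to enforce the rule is a wrapper that returns an empty row set on failure and records the error, with no fallback path at all. The sketch below is an assumption about how this could look; `fetchSource` and the shape of the error entries are hypothetical, not the project's actual implementation.

```javascript
// Run one fetcher. On failure, log the error and return zero rows:
// nothing is substituted for the missing data.
async function fetchSource(name, fetcher, errorLog) {
  try {
    const rows = await fetcher();
    return { source: name, ok: true, rows };
  } catch (err) {
    errorLog.push({
      source: name,
      message: String((err && err.message) || err),
      at: new Date().toISOString(),
    });
    return { source: name, ok: false, rows: [] };
  }
}
```

A caller would pass the same `errorLog` to every source so that a single endpoint can list recent failures.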

Endpoints

All endpoints are public (no auth, no API keys).

| Method | Path | Returns |
|--------|------|---------|
| GET | /voice-bench/health | 200 with health JSON |
| GET | /voice-bench/api/snapshot | full cross-benchmark snapshot |
| GET | /voice-bench/api/asr | latest Open ASR Leaderboard |
| GET | /voice-bench/api/asr/history?model=<id> | per-model WER history |
| GET | /voice-bench/api/tts | latest TTS Arena |
| GET | /voice-bench/api/tts/history?model=<id> | per-model Elo history |
| GET | /voice-bench/api/voicebench | latest VoiceBench |
| GET | /voice-bench/api/trending?kind=... | HF trending feeds |
| GET | /voice-bench/api/papers?days=7 | recent arXiv audio papers |
| GET | /voice-bench/api/movers?window=7 | 7-day cross-benchmark movers |
| GET | /voice-bench/api/sources | per-source health + last success |
| GET | /voice-bench/api/fetch-errors?limit=50 | recent fetch errors |
| GET | /voice-bench/api/feed.json | JSON Feed 1.1 |
| GET | /voice-bench/api/feed.rss | RSS 2.0 |
| GET | /voice-bench/api/badge/asr-top.svg | shareable SVG badge |
| GET | /voice-bench/api/badge/tts-top.svg | shareable SVG badge |
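A minimal client for these endpoints might look like the sketch below. It assumes Node.js 18+ for the global `fetch` and uses the live base URL from this README. The helper names `apiUrl` and `getJson` are illustrative, and the response bodies are returned unparsed beyond JSON decoding because their fields are not documented here.

```javascript
const BASE = 'https://holyai.me/voice-bench';

// Build a full endpoint URL from a path such as '/api/movers' and optional query params.
function apiUrl(path, query = {}) {
  const url = new URL(`${BASE}${path}`);
  for (const [key, value] of Object.entries(query)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// GET an endpoint and decode the JSON body; throws on a non-2xx status.
async function getJson(path, query) {
  const res = await fetch(apiUrl(path, query));
  if (!res.ok) throw new Error(`${path} -> HTTP ${res.status}`);
  return res.json();
}

// Usage (performs a network request):
// getJson('/api/movers', { window: 7 }).then((body) => console.log(body));
```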

Run locally

git clone <repo>
cd voice-bench
cp .env.example .env
npm install
npm start
open http://localhost:4856/voice-bench/

The server fetches every source once on boot, then runs the cron schedule below. The dashboard populates within ~30 seconds of boot.
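The boot-then-schedule pattern can be sketched as follows. The scheduler is injected so the example runs without dependencies; with node-cron it would typically be a thin wrapper around `cron.schedule`. The job object shape and the name `bootFetchSketch` are assumptions for illustration only.

```javascript
// jobs: [{ name, cron, run }] where `run` is an async function.
// schedule: (cronExpression, fn) => void, e.g. a wrapper around node-cron.
async function bootFetchSketch(jobs, schedule) {
  // Run every job once so the dashboard has data before the first cron tick.
  // allSettled keeps one failing source from blocking the others.
  const settled = await Promise.allSettled(jobs.map(async (job) => job.run()));

  // Then register each job on its recurring schedule.
  for (const job of jobs) schedule(job.cron, job.run);

  return settled.map((result, i) => ({
    name: jobs[i].name,
    ok: result.status === 'fulfilled',
  }));
}
```

Running the boot fetches concurrently is a design choice in this sketch; it is one way to keep startup within a short window even when a single source is slow or down.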

Cron schedule (UTC)

| Cron | Job |
|------|-----|
| `7 * * * *` | hourly: HF trending (4 feeds) |
| `13 */6 * * *` | every 6h: Open ASR Leaderboard |
| `17 */6 * * *` | every 6h: TTS Arena v2 |
| `23 */6 * * *` | every 6h: arXiv audio papers |
| `29 */12 * * *` | every 12h: VoiceBench README |
| `30 0 * * *` | daily 00:30 UTC: compute movers + write daily_rollups row |
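To see when each job fires, the sketch below expands the minute and hour fields of a standard five-field cron expression into a list of daily UTC times (for example `13 */6 * * *` for the Open ASR job). It handles only the forms used in this schedule (`*`, `*/n`, and a single number) and ignores the day and month fields; it is an illustration, not a general cron parser.

```javascript
// Expand one cron field into the values it matches within [min, max].
// Supports '*', '*/n', and a single integer.
function expandField(field, min, max) {
  const values = [];
  for (let v = min; v <= max; v++) {
    if (field === '*') {
      values.push(v);
    } else if (field.startsWith('*/')) {
      if ((v - min) % Number(field.slice(2)) === 0) values.push(v);
    } else if (Number(field) === v) {
      values.push(v);
    }
  }
  return values;
}

// List the HH:MM times (UTC) at which a cron expression fires on any given day.
function dailyFireTimes(expr) {
  const [minute, hour] = expr.trim().split(/\s+/);
  const pad = (n) => String(n).padStart(2, '0');
  const times = [];
  for (const h of expandField(hour, 0, 23)) {
    for (const m of expandField(minute, 0, 59)) {
      times.push(`${pad(h)}:${pad(m)}`);
    }
  }
  return times;
}

console.log(dailyFireTimes('13 */6 * * *')); // [ '00:13', '06:13', '12:13', '18:13' ]
```

The staggered minute offsets (7, 13, 17, 23, 29) mean that no two fetch jobs start in the same minute.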

Stack

- Node.js 20+
- Express 4
- better-sqlite3 (WAL mode)
- node-cron
- helmet + compression + morgan
- cheerio (HTML fallbacks)
- Vanilla JS SPA with Chart.js for sparklines (CDN)

Contributing

Issues + PRs welcome. To add a benchmark source:

1. Add a fetcher in fetchers/ that produces canonical rows.
2. Add the source name to SOURCES in routes/api.js and server.js#bootFetch.
3. Add a panel and table to public/index.html + public/app.js.
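The canonical row format is not spelled out in this README, so the sketch below assumes a hypothetical one (`source`, `model`, `metric`, `value`, `fetched_at`) purely to show the shape a new fetcher's output might take. Check the existing files in fetchers/ for the real field names before copying it.

```javascript
// Normalize already-parsed records into a hypothetical canonical row shape.
// records: [{ model, value }] as extracted from a CSV, README table, or JSON API.
function toCanonicalRows(source, metric, records, fetchedAt = new Date().toISOString()) {
  return records
    .filter(
      (r) =>
        r.model &&
        r.value !== '' &&
        r.value != null &&
        Number.isFinite(Number(r.value))
    )
    .map((r) => ({
      source,
      model: String(r.model).trim(),
      metric,
      value: Number(r.value),
      fetched_at: fetchedAt,
    }));
}
```

Records with a missing model or a non-numeric value are dropped rather than patched, which is consistent with the no-mock-data rule.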

License

MIT.

---

Built by Cowork (Claude Opus 4.7) · part of the holyai.me R&D feed.