embed-arena
Live MTEB embedding-model leaderboard with cost intelligence. Pick the best embedding for your RAG budget in 30 seconds.
embed-arena pulls every embedding model with public MTEB results from
HuggingFace, computes per-category mean scores, joins with a curated provider-pricing
table, and serves a single-page leaderboard you can filter by license, dimension,
modality, and budget.
- Live data, no mocks. Benchmark scores fetched from huggingface.co/datasets/mteb/results.
- Cost intelligence. Each closed-source row carries a documented price + source URL + verification date.
- Self-host friendly. Single Node.js process, single SQLite file, no external dependencies at runtime.
- Public. No login, no signup, no API key.
Quickstart
git clone <repo>
cd embed-arena
cp .env.example .env
npm install
npm start
# open http://localhost:4751/embed-arena/
The first boot performs a small warm-up (~30 models, ~40 MTEB tasks each).
Subsequent boots use the cached SQLite file. Scheduled cron jobs then refresh each source on its own cadence, from every 6 hours to weekly (see Data sources).
Endpoints
All endpoints are public and unauthenticated.
| Endpoint | Method | Description |
| --- | --- | --- |
| /embed-arena/health | GET | Service status + DB row counts |
| /embed-arena/api/models | GET | Filterable, sortable model list |
| /embed-arena/api/models/:id | GET | Per-model details + per-task scores |
| /embed-arena/api/categories | GET | List of category aggregates |
| /embed-arena/api/stats | GET | Open-vs-closed gap, top model, cheapest, best value |
| /embed-arena/api/refresh-log | GET | Recent refresh activity |
| /embed-arena/api/refresh | POST | Trigger an MTEB refresh (rate-limited per IP, no auth) |
| /embed-arena/api/refresh/:jobId | GET | Poll a refresh job |
| /embed-arena/api/sources | GET | List of every external data source URL we hit |
| /embed-arena/api/admin/sync-providers | POST | Re-sync data/providers.json into the DB |
| /embed-arena/api/export.csv?category=overall_eng | GET | Download current ranking as CSV |
Query params for /api/models
- category: overall_eng, overall_multi, retrieval, classification, sts, clustering, reranking, code, summarization
- license: open or closed
- min_dim, max_dim: vector dimension bounds
- max_price: USD per million input tokens
- min_score: minimum mean MTEB score (0–100 normalized)
- modality: text, image, code, audio, video
- search: substring match on id / display_name
- sort: score (default), value, cheapest, downloads, recent, name
- limit: max rows, default 100, cap 500
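As a rough illustration of combining these parameters, the sketch below assembles a request URL with the standard URLSearchParams API. The helper name buildModelsUrl is hypothetical and not part of embed-arena. The base URL is the Quickstart default and will differ in other deployments.

```javascript
// Hypothetical helper (not part of embed-arena): builds a /api/models URL
// from a plain object of the query params listed above.
function buildModelsUrl(filters, base = "http://localhost:4751/embed-arena") {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(filters)) {
    // Skip unset filters so the server can apply its own defaults.
    if (value === undefined || value === null || value === "") continue;
    params.set(key, String(value));
  }
  const query = params.toString();
  return query ? `${base}/api/models?${query}` : `${base}/api/models`;
}

// Open-license retrieval models with at most 1024 dimensions, top 20 by score.
const url = buildModelsUrl({
  category: "retrieval",
  license: "open",
  max_dim: 1024,
  sort: "score",
  limit: 20,
});
console.log(url);
// http://localhost:4751/embed-arena/api/models?category=retrieval&license=open&max_dim=1024&sort=score&limit=20
```

The resulting string can be passed to fetch or curl. The response shape is not documented in this README, so inspect it before relying on specific fields.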
Data sources
| Source | Purpose | Refresh |
| --- | --- | --- |
| https://huggingface.co/api/datasets/mteb/results/tree/main/results | Discovery: every model with public MTEB evals | Daily 03:00 UTC |
| https://huggingface.co/datasets/mteb/results/resolve/main/results/{model}/{rev}/{task}.json | Real benchmark main_score per task | Daily 05:00 UTC |
| https://huggingface.co/api/models/{owner}/{model} | Live downloads, license, dimensions, parameter count | Every 6 hours |
| data/providers.json (file) | Documented proprietary pricing (OpenAI, Cohere, Voyage, Google, Mistral, Amazon, Jina, NVIDIA) | Weekly Monday 04:00 UTC |
Pricing is never fabricated: each row in data/providers.json carries a pricing_source_url pointing at the provider's published docs and a last_verified date. The UI links to the source on every model detail card.
The full source list is also exposed at /embed-arena/api/sources.
Schema
See db.js for the full DDL. The interesting tables:
- models: one row per embedding model, joins HF metadata + provider pricing
- benchmark_scores: per-task normalized scores from MTEB
- category_aggregates: per-model means by category (recomputed after each score refresh)
- refresh_log, refresh_jobs: refresh history
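To make the relationship between benchmark_scores and category_aggregates concrete, here is a minimal in-memory sketch of a per-model, per-category mean. The row shape { model_id, category, score } and the function name categoryMeans are assumptions for illustration. The real column names and the actual recompute logic live in db.js and may differ.

```javascript
// Sketch only: the row shape { model_id, category, score } is an assumption,
// not the actual schema defined in db.js.
function categoryMeans(rows) {
  // Key is the (model_id, category) pair; value accumulates sum and count.
  const buckets = new Map();
  for (const { model_id, category, score } of rows) {
    const key = JSON.stringify([model_id, category]);
    const bucket = buckets.get(key) ?? { model_id, category, sum: 0, count: 0 };
    bucket.sum += score;
    bucket.count += 1;
    buckets.set(key, bucket);
  }
  return [...buckets.values()].map(({ model_id, category, sum, count }) => ({
    model_id,
    category,
    mean_score: sum / count,
    task_count: count,
  }));
}

// Made-up scores for illustration, not real MTEB results.
const sample = [
  { model_id: "example/model-a", category: "retrieval", score: 60 },
  { model_id: "example/model-a", category: "retrieval", score: 70 },
  { model_id: "example/model-a", category: "sts", score: 80 },
];
console.log(categoryMeans(sample));
```

With the sample rows, the retrieval mean is 65 over two tasks and the sts mean is 80 over one task.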
How "Value Score" is computed
For models with a known price-per-million-tokens, value_score = mean_overall_eng_score / price_per_million_tokens_usd.
Higher is better (quality per dollar). Self-hosted models display "—" for value
because their cost depends on your hardware, not on a per-token rate.
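The formula above can be sketched as a small function. The names valueScore and formatValue are hypothetical, the numbers are made up, and returning no value for a zero or negative price is an assumption of this sketch rather than documented behavior.

```javascript
// Sketch of the value-score formula described above.
// Returns null when there is no per-token price (self-hosted models).
function valueScore(meanOverallEngScore, pricePerMillionTokensUsd) {
  if (pricePerMillionTokensUsd === null || pricePerMillionTokensUsd === undefined) {
    return null;
  }
  // Assumption: a zero or negative price is treated as "no value score"
  // to avoid dividing by zero. The real service may handle this differently.
  if (!(pricePerMillionTokensUsd > 0)) return null;
  return meanOverallEngScore / pricePerMillionTokensUsd;
}

// Render the score the way the README describes: a dash when there is no price.
function formatValue(score) {
  return score === null ? "\u2014" : score.toFixed(1);
}

// Made-up numbers: a mean score of 60 at $0.50 per million tokens.
console.log(formatValue(valueScore(60, 0.5)));  // 120.0
console.log(formatValue(valueScore(60, null))); // dash for a self-hosted model
```

Because the score is a plain ratio, halving the price doubles the value score at the same quality, so it is most useful for comparing models of broadly similar quality.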
Adding a new closed-source model
- Append to data/providers.json with id, provider, price_per_million_tokens_usd, pricing_source_url, last_verified, plus dimensions and modalities if known.
- Run: curl -X POST http://localhost:4751/embed-arena/api/admin/sync-providers
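Before syncing, it can help to check that a new entry has every required field. The sketch below is one way to do that. The helper missingProviderFields is hypothetical, the entry values are placeholders rather than real pricing, and the overall layout of data/providers.json is not assumed here.

```javascript
// Required fields as listed in the step above.
const REQUIRED_FIELDS = [
  "id",
  "provider",
  "price_per_million_tokens_usd",
  "pricing_source_url",
  "last_verified",
];

// Hypothetical helper: returns the names of required fields that are absent or empty.
function missingProviderFields(entry) {
  return REQUIRED_FIELDS.filter(
    (field) => entry[field] === undefined || entry[field] === null || entry[field] === ""
  );
}

// Placeholder entry for illustration only, not a real provider or price.
const draft = {
  id: "example-provider/example-embed-v1",
  provider: "ExampleCo",
  price_per_million_tokens_usd: 0.5,
  pricing_source_url: "https://example.com/pricing",
};
console.log(missingProviderFields(draft)); // [ 'last_verified' ]
```

An empty array means the entry carries the documented pricing fields and is ready for the sync-providers call.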
License
MIT. Built with Express, better-sqlite3, node-cron, helmet, compression.
Built by holyai.me — part of the Cowork R&D pipeline.