Local Rank

Live leaderboard of open-source on-device LLM serving runtimes, refreshed from GitHub.

local-rank

local-rank tracks the engines that turn quantized model files into
tokens on your own hardware: Ollama, llama.cpp, vLLM, mlx-lm, LocalAI,
Jan, llamafile, KoboldCPP, GPT4All, mistral.rs, MLC LLM, TGI, exo,
RamaLama, and SGLang. Metrics are refreshed hourly from the GitHub REST
API (every 4 hours when no token is configured).

Demo: <https://holyai.me/local-rank/>

Why

Six of the ten trending GitHub repositories this week declare an on-device
or local-first architecture, but no public dashboard tracks the velocity
of this segment in isolation. Existing leaderboards either lump local and
hosted inference together or focus on agent frameworks.

local-rank answers "which on-device runtime is actually winning this
month?" without benchmarks, without opinions, and without a login.

Data sources

Every cell on the leaderboard comes from one of the four GitHub REST
endpoints below. No values are hardcoded, mocked, or generated.

| Source | Endpoint | Cadence |
|---|---|---|
| Repo metadata (stars, forks, watchers, issues, license, pushed_at) | GET /repos/{owner}/{repo} | hourly (4h without token) |
| Releases (latest tag + 30-day count) | GET /repos/{owner}/{repo}/releases?per_page=100 | hourly (4h without token) |
| Commits in last 30 days | GET /repos/{owner}/{repo}/commits?per_page=1&since={iso} | hourly (4h without token) |
| Contributor count | GET /repos/{owner}/{repo}/contributors?per_page=1&anon=true | hourly (4h without token) |

If a runtime's repo lookup fails, the runtime is logged to fetch_log and
skipped for that refresh. It is never stubbed.
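
The commit and contributor rows in the table above request per_page=1, which suggests the counts are read from pagination rather than by downloading every item. A minimal sketch of that approach follows, assuming the standard GitHub Link response header. The function names countFromLinkHeader and fetchCount are hypothetical and are not taken from the local-rank source.

```javascript
// Sketch only: countFromLinkHeader and fetchCount are hypothetical names,
// not functions taken from the local-rank source.

// With per_page=1 every page holds one item, so the page number in the
// rel="last" link equals the total number of items.
function countFromLinkHeader(linkHeader, itemsOnThisPage) {
  if (!linkHeader) {
    // No Link header: the whole result fit on the single page we fetched.
    return itemsOnThisPage;
  }
  const match = linkHeader.match(/<([^>]+)>\s*;\s*rel="last"/);
  if (!match) {
    // Page 1 without a rel="last" link is treated as the only page.
    return itemsOnThisPage;
  }
  const lastPage = new URL(match[1]).searchParams.get("page");
  return Number.parseInt(lastPage, 10);
}

// Fetch a per_page=1 list URL and return the derived count.
// Requires a runtime with a global fetch (for example Node 18+).
async function fetchCount(url, token) {
  const headers = { Accept: "application/vnd.github+json" };
  if (token) headers.Authorization = `Bearer ${token}`;
  const res = await fetch(url, { headers });
  if (!res.ok) throw new Error(`GitHub responded ${res.status} for ${url}`);
  const items = await res.json();
  return countFromLinkHeader(res.headers.get("link"), items.length);
}
```

A caller that follows the skip-on-failure rule above would catch the error thrown by fetchCount, record it, and move on to the next runtime instead of substituting a value.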

API

All endpoints are prefixed with BASE_PATH (default /local-rank). No authentication is required.

| Method | Path | Description |
|---|---|---|
| GET | /local-rank/health | { ok, ts, last_refresh, runtimes }. Returns 200 even with empty DB. |
| GET | /local-rank/api/runtimes | Leaderboard. Filters: ?engine=, ?apple_silicon=1, ?cuda=1, ?rocm=1, ?openai_compatible=1, ?q=, ?sort=velocity\|stars\|stars_7d\|stars_30d\|releases_30d\|commits_30d\|last_release\|last_commit. |
| GET | /local-rank/api/runtimes/:slug | Full runtime row + capability matrix. |
| GET | /local-rank/api/history/:slug?days=30 | Snapshot series for sparklines. |
| GET | /local-rank/api/stats | Totals, top movers, breakdown by engine. |
| POST | /local-rank/api/refresh | Trigger an out-of-band refresh. Returns 202 if one is already running. |
| GET | /local-rank/api/refresh/status | { running, last_fetch }. |
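
As an illustration of the filter parameters listed for /api/runtimes, the sketch below builds a query URL. The helper name buildRuntimesUrl is hypothetical, the origin is the local development address from this README, and the shape of the JSON response is not documented here, so the example stops at the URL.

```javascript
// Sketch only: buildRuntimesUrl is a hypothetical helper, not part of local-rank.
// Boolean filters are sent as "1"; false, null, and undefined filters are omitted.
function buildRuntimesUrl(origin, filters = {}, basePath = "/local-rank") {
  const url = new URL(`${basePath}/api/runtimes`, origin);
  for (const [key, value] of Object.entries(filters)) {
    if (value === undefined || value === null || value === false) continue;
    url.searchParams.set(key, value === true ? "1" : String(value));
  }
  return url.toString();
}

// Example: CUDA-capable runtimes ordered by 7-day star growth.
const exampleUrl = buildRuntimesUrl("http://127.0.0.1:4902", {
  cuda: true,
  sort: "stars_7d",
});
// exampleUrl is "http://127.0.0.1:4902/local-rank/api/runtimes?cuda=1&sort=stars_7d"
```

If BASE_PATH is changed for a deployment, the third argument would need to match it.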

Scoring

Composite velocity score on a 0–100 scale, recomputed every refresh. All
inputs are min-max normalized across the current catalog, then weighted
(the weights sum to 1.0):

| Weight | Input |
|---|---|
| 0.30 | star delta 7d / current stars (relative momentum) |
| 0.20 | star delta 30d / current stars |
| 0.15 | commits in last 30 days |
| 0.10 | releases in last 30 days |
| 0.10 | recency of last release (linear decay over 90 days) |
| 0.10 | recency of last commit (linear decay over 30 days) |
| 0.05 | log(contributors) |

Tiny or new repos get a minimum score of 5 so they remain visible. The
implementation lives in lib/scoring.js.
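
To make the weighting concrete, here is a minimal sketch of a min-max weighted score using the weights from the table above. It is not the code in lib/scoring.js. The field names (starDelta7d, daysSinceRelease, and so on), the natural logarithm, the rounding, and applying the floor to every runtime are all assumptions made for illustration.

```javascript
// Sketch only: field names, rounding, and floor handling are assumptions.
const WEIGHTS = {
  starMomentum7d: 0.30,
  starMomentum30d: 0.20,
  commits30d: 0.15,
  releases30d: 0.10,
  releaseRecency: 0.10,
  commitRecency: 0.10,
  logContributors: 0.05,
};

// Linear decay: 1 for "today", 0 at or beyond the window.
function recency(daysAgo, windowDays) {
  return Math.max(0, 1 - daysAgo / windowDays);
}

// Turn one runtime row into the seven raw inputs from the weight table.
function rawInputs(r) {
  return {
    starMomentum7d: r.stars > 0 ? r.starDelta7d / r.stars : 0,
    starMomentum30d: r.stars > 0 ? r.starDelta30d / r.stars : 0,
    commits30d: r.commits30d,
    releases30d: r.releases30d,
    releaseRecency: recency(r.daysSinceRelease, 90),
    commitRecency: recency(r.daysSinceCommit, 30),
    logContributors: Math.log(Math.max(1, r.contributors)),
  };
}

// Min-max normalize each input across the catalog, apply the weights,
// scale to 0-100, and clamp to the floor.
function scoreCatalog(runtimes, floor = 5) {
  const inputs = runtimes.map(rawInputs);
  const ranges = {};
  for (const key of Object.keys(WEIGHTS)) {
    const values = inputs.map((x) => x[key]);
    ranges[key] = { min: Math.min(...values), max: Math.max(...values) };
  }
  return runtimes.map((r, i) => {
    let total = 0;
    for (const [key, weight] of Object.entries(WEIGHTS)) {
      const { min, max } = ranges[key];
      // When every runtime has the same value, the input contributes nothing.
      const normalized = max > min ? (inputs[i][key] - min) / (max - min) : 0;
      total += weight * normalized;
    }
    return { slug: r.slug, velocity: Math.max(floor, Math.round(total * 100)) };
  });
}
```

Because normalization is relative to the current catalog, a runtime's score can change when another runtime is added or removed, even if its own GitHub metrics stay the same.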

Local development

    cp .env.example .env
    # optional but recommended: set GITHUB_TOKEN to a personal access token
    # with public_repo scope so the cron can run hourly instead of every 4h
    npm install
    npm start
    # open http://127.0.0.1:4902/local-rank/

Run a one-shot refresh outside of the cron:

    npm run refresh

Environment

| Variable | Default | Notes |
|---|---|---|
| PORT | 4902 | HTTP port |
| NODE_ENV | production | |
| GITHUB_TOKEN | __INJECT_FROM_VAULT__ | Personal access token, public_repo scope is enough. Required for hourly cadence. |
| BASE_PATH | /local-rank | URL prefix, must match the reverse-proxy mount |
| DB_PATH | ./local-rank.db | SQLite path |
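
One way the variables above could be read with their documented defaults is sketched below. The function name loadConfig, the returned property names, and the choice to treat the __INJECT_FROM_VAULT__ placeholder as "no token" are assumptions, not the project's actual configuration code.

```javascript
// Sketch only: loadConfig and its return shape are hypothetical.
function loadConfig(env = process.env) {
  const token = env.GITHUB_TOKEN;
  // Assumption: an unset token or the vault placeholder both mean "no token".
  const hasToken = Boolean(token) && token !== "__INJECT_FROM_VAULT__";
  return {
    port: Number.parseInt(env.PORT ?? "4902", 10),
    nodeEnv: env.NODE_ENV ?? "production",
    basePath: env.BASE_PATH ?? "/local-rank",
    dbPath: env.DB_PATH ?? "./local-rank.db",
    githubToken: hasToken ? token : null,
    // Hourly cadence with a token, every 4 hours without one.
    refreshIntervalHours: hasToken ? 1 : 4,
  };
}
```

Passing the environment as an argument keeps the function easy to exercise with a plain object instead of the real process environment.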

Adding a runtime

Append a new entry to lib/catalog.js. Use the real GitHub coordinates
(github_owner, github_repo); the next cron tick will populate every
metric. Capability flags (engine, apple_silicon, cuda, …) are static
metadata derived from the project's README.
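
For orientation, a catalog entry might look roughly like the sketch below. Only github_owner, github_repo, engine, apple_silicon, and cuda are named in this README. The slug, rocm, and openai_compatible fields are inferred from the API filters, the boolean flag type is a guess, and every value is a placeholder, so check lib/catalog.js for the real shape before copying this.

```javascript
// Hypothetical catalog entry: every value here is a placeholder.
const exampleEntry = {
  slug: "example-runtime",
  github_owner: "example-org", // use the real GitHub owner
  github_repo: "example-runtime", // use the real GitHub repository name
  engine: "llama.cpp", // placeholder; valid engine values are not listed here
  apple_silicon: true,
  cuda: true,
  rocm: false,
  openai_compatible: true,
};

// Hypothetical helper: list the GitHub coordinates an entry is missing,
// since the refresh cannot populate metrics without them.
function missingCoordinates(entry) {
  return ["github_owner", "github_repo"].filter(
    (key) => typeof entry[key] !== "string" || entry[key].length === 0
  );
}
```

An entry with a wrong owner or repository name would fail its repo lookup and be skipped, so it is worth confirming the coordinates on GitHub before adding them.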

What this is not

License

MIT.