About local-rank
local-rank tracks the open-source runtimes that serve large language models on your own hardware — laptops, desktops, servers, even phones. We focus on the engine layer, the software that takes a quantized model file and turns it into tokens.
The leaderboard is not a benchmark. It is a community-velocity signal: which runtimes are gaining stars, shipping releases, and merging commits this week.
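To make the idea of a velocity signal concrete, here is a minimal sketch of how one week of activity could be folded into a single ordering. It is an illustration only: the page does not describe the actual scoring, so the `WeeklyActivity` fields, the `velocity_score` function, the weights, and the runtime names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WeeklyActivity:
    """One runtime's activity over a single week (hypothetical schema)."""
    name: str
    stars_gained: int
    releases: int
    commits_merged: int


def velocity_score(activity: WeeklyActivity,
                   w_stars: float = 1.0,
                   w_releases: float = 20.0,
                   w_commits: float = 2.0) -> float:
    """Weighted sum of the three weekly signals.

    The weights are arbitrary placeholders, not the leaderboard's real values.
    """
    return (w_stars * activity.stars_gained
            + w_releases * activity.releases
            + w_commits * activity.commits_merged)


def rank_by_velocity(activities: list[WeeklyActivity]) -> list[WeeklyActivity]:
    """Return the runtimes ordered from highest to lowest weekly score."""
    return sorted(activities, key=velocity_score, reverse=True)


if __name__ == "__main__":
    # Made-up numbers for two made-up runtimes.
    week = [
        WeeklyActivity("runtime-a", stars_gained=100, releases=1, commits_merged=10),
        WeeklyActivity("runtime-b", stars_gained=50, releases=0, commits_merged=60),
    ]
    for entry in rank_by_velocity(week):
        print(entry.name, velocity_score(entry))
```

Nothing in this sketch measures inference speed: it only looks at repository activity, which is why a ranking of this kind says nothing about tokens/second.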
The catalog is hand-curated and intentionally narrow. If a project you care about is missing, file an issue on the upstream repository and reference this page. We add new runtimes as the ecosystem grows.
What we do not do
- We do not benchmark tokens/second.
- We do not maintain a model catalog.
- We do not rank hosted inference providers.