endpoint-truth

Live, public trust board for LLM API endpoints. Ranks every provider serving every popular model by a transparent 0–100 trust score so you can see who actually serves a given model well today.

Why

As of May 2026 there are 400+ LLMs, and many of them are served by 5–20 different
providers through OpenRouter and other gateways. Two providers selling "the
same model" can deliver very different products: one runs the un-quantized
weights at full context with high uptime; the other quietly serves an FP4
distill at half the throughput with degraded quality. The visible price gap
is not enough to tell them apart.

endpoint-truth watches the OpenRouter public API and surfaces, for each
provider, the silent factors that matter — quantization disclosure, uptime
stability, price variance from median, throughput, and context fidelity.

Stack

Node.js 20+
Express
better-sqlite3 (WAL)
node-cron
helmet, compression
Vanilla JS SPA (no framework, no bundler)

Install & run

npm install
cp .env.example .env
npm start

Visit <http://localhost:4880/endpoint-truth/>.

Endpoints

All routes are public — no authentication. Mounted under BASE_PATH
(default /endpoint-truth).

HTML:

GET /endpoint-truth/ — dashboard (models index)
GET /endpoint-truth/m/:model_id — per-model trust ranking
GET /endpoint-truth/p/:provider — per-provider model list
GET /endpoint-truth/deltas — recent change feed
GET /endpoint-truth/about — rubric explanation

JSON (CORS open):

GET /endpoint-truth/api/stats
GET /endpoint-truth/api/models
GET /endpoint-truth/api/models/:id
GET /endpoint-truth/api/providers
GET /endpoint-truth/api/providers/:name
GET /endpoint-truth/api/deltas?since=ISO&limit=N
GET /endpoint-truth/api/series?model=…&provider=…&tag=…&window=86400
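
The JSON routes above take plain query strings, so a client only needs to build the URL correctly. The sketch below is a hypothetical helper (`apiUrl` is not part of the project) that assembles such URLs with Node's built-in `URL` class. It assumes the default BASE_PATH, and the shape of the JSON responses is not documented here, so it stops at the URL.

```javascript
// Hypothetical client-side helper: builds a URL for one of the JSON routes.
// `basePath` defaults to the documented BASE_PATH default.
function apiUrl(origin, path, params = {}, basePath = '/endpoint-truth') {
  const url = new URL(`${basePath}/api/${path}`, origin);
  for (const [key, value] of Object.entries(params)) {
    // Skip unset filters so they do not appear as "key=undefined".
    if (value !== undefined && value !== null) {
      url.searchParams.set(key, String(value));
    }
  }
  return url.toString();
}

// Model ids contain a slash, which URLSearchParams percent-encodes.
console.log(apiUrl('http://localhost:4880', 'series', {
  model: 'openai/gpt-4o',
  window: 86400,
}));
```

The resulting string can be passed straight to `fetch`, which is built into Node.js 18 and later.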

Ops:

GET /health and GET /healthz — {status:"ok", last_refresh, models_monitored}

Real data sources

endpoint-truth reads exclusively from the OpenRouter public API. Both URLs
are unauthenticated and queryable from any host.

| URL | Used for | Frequency |
|---|---|---|
| https://openrouter.ai/api/v1/models | Model catalogue + family selection | Every 30 min |
| https://openrouter.ai/api/v1/models/{id}/endpoints | Per-provider stats (uptime, latency, throughput, quantization, pricing) | Every 30 min, one request per monitored model |

We monitor up to MAX_MODELS_PER_CYCLE models (default 120), chosen from a
deterministic list of well-known model-family prefixes (anthropic/, openai/,
google/, meta-llama/, mistralai/, deepseek/, qwen/, x-ai/, etc.).
There is no hardcoded provider data, no seed array, no Math.random() in any
data path. If OpenRouter is unreachable, the refresh cycle records the error
and skips — it does not invent endpoints.
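
A minimal sketch of the selection and polling described above, not the project's actual code. The function names, the sort-then-truncate rule used to make selection deterministic, and the result shape are all assumptions. The only external call is the documented per-model endpoints URL.

```javascript
// Assumed selection rule: keep ids that start with a known family prefix,
// sort them so the result is deterministic, then cap at `max`.
function selectMonitoredModels(modelIds, prefixes, max) {
  return modelIds
    .filter((id) => prefixes.some((prefix) => id.startsWith(prefix)))
    .sort()
    .slice(0, max);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Fetches per-provider stats one model at a time, waiting `throttleMs`
// between calls. Failures are recorded, never replaced with made-up data.
async function fetchEndpoints(modelIds, userAgent, throttleMs = 600) {
  const results = [];
  for (const id of modelIds) {
    try {
      const res = await fetch(`https://openrouter.ai/api/v1/models/${id}/endpoints`, {
        headers: { 'User-Agent': userAgent },
      });
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      results.push({ id, body: await res.json() });
    } catch (err) {
      results.push({ id, error: String(err) });
    }
    await sleep(throttleMs);
  }
  return results;
}
```

With the defaults from the Environment table (120 models, 600 ms gap), one full cycle spends at least 72 seconds in throttling alone, well inside the 30-minute refresh interval.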

Trust score rubric

| Dimension | Max | Computation |
|---|---|---|
| Quantization honesty | 25 | null/unknown → 0; fp16/bf16 → 25; fp8 → 18; int8 → 14; fp4/awq/gptq → 8; int4 → 4. Penalises non-disclosure. |
| Uptime | 25 | uptime_last_1d × 0.6 + uptime_last_30m × 0.4, scaled to 25. |
| Price reasonability | 20 | Log-distance from the median prompt price across all providers for the same model. Within ±15% → 20. 2× away in either direction → 0. |
| Throughput | 15 | throughput_last_30m / median_throughput, capped at 1.0, scaled to 15. |
| Context fidelity | 15 | provider_context_length / model_context_length. ≥95% → 15, ≤50% → 0. |

Total 100, banded: green ≥ 80, amber ≥ 60, red < 60.

Every snapshot stores the full component vector in trust_breakdown so the
score is auditable.
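
The rubric can be sketched as a single pure function. This is an illustration under stated assumptions, not the shipped scorer: uptime fields are taken to be percentages (0 to 100), the price and context dimensions are interpolated linearly between the thresholds the table gives (price in log space, symmetric around the median), a missing or zero price scores 0, and the input field names other than the `uptime_*` and `throughput_*` ones are hypothetical.

```javascript
// Points per disclosed quantization level, from the rubric table.
const QUANT_POINTS = new Map(Object.entries({
  fp16: 25, bf16: 25, fp8: 18, int8: 14, fp4: 8, awq: 8, gptq: 8, int4: 4,
}));

// Linear ramp: 0 at `lo`, 1 at `hi`, clamped to [0, 1]. (Assumed interpolation.)
function ramp(x, lo, hi) {
  return Math.min(1, Math.max(0, (x - lo) / (hi - lo)));
}

// `ep` is one provider endpoint, `ctx` holds per-model medians and limits.
function trustScore(ep, ctx) {
  // Null or unrecognised quantization scores 0 (non-disclosure penalty).
  const quant = QUANT_POINTS.get(String(ep.quantization ?? '').toLowerCase()) ?? 0;

  // Assumes uptime values are percentages in 0..100.
  const uptime = ((ep.uptime_last_1d * 0.6 + ep.uptime_last_30m * 0.4) / 100) * 25;

  // Full marks within ln(1.15) of the median, zero at ln(2) or further.
  const priced = ep.prompt_price > 0 && ctx.medianPromptPrice > 0;
  const logDist = priced ? Math.abs(Math.log(ep.prompt_price / ctx.medianPromptPrice)) : Infinity;
  const price = (1 - ramp(logDist, Math.log(1.15), Math.log(2))) * 20;

  const throughput = Math.min(1, ep.throughput_last_30m / ctx.medianThroughput) * 15;

  // 0 points at 50% of the model's context or less, 15 at 95% or more.
  const context = ramp(ep.context_length / ctx.modelContextLength, 0.5, 0.95) * 15;

  const total = Math.round(quant + uptime + price + throughput + context);
  const band = total >= 80 ? 'green' : total >= 60 ? 'amber' : 'red';
  return { total, band, breakdown: { quant, uptime, price, throughput, context } };
}
```

Returning the per-dimension `breakdown` next to the total mirrors the idea of storing the component vector with each snapshot, so a reader can see which dimension cost a provider its points.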

Environment

| Variable | Default | Purpose |
|---|---|---|
| PORT | 4880 | HTTP listen port |
| BASE_PATH | /endpoint-truth | Path prefix (matches nginx route) |
| DB_PATH | ./data/endpoint-truth.db | SQLite database |
| USER_AGENT | endpoint-truth/1.0 … | User-Agent header sent to OpenRouter |
| REFRESH_CRON | `*/30 * * * *` | Full refresh schedule |
| DELTA_CRON | `7 * * * *` | Hourly delta computation |
| PRUNE_CRON | `0 3 * * *` | Daily snapshot prune |
| MAX_MODELS_PER_CYCLE | 120 | Cap on models scraped per cycle |
| FETCH_THROTTLE_MS | 600 | Minimum gap between OpenRouter HTTP calls |
| SNAPSHOT_RETENTION_DAYS | 30 | Snapshot rolling window |
| OPENROUTER_API_KEY | — (vault) | Reserved for v2 (canary prompts) |
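
As an illustration of how these variables might be read, the sketch below parses the numeric and path settings from an environment object and falls back to the documented defaults. `loadConfig` and its property names are hypothetical, and the cron and User-Agent variables are left out on purpose.

```javascript
// Hypothetical config loader: reads a subset of the variables above and
// falls back to the documented default when a value is missing or invalid.
function loadConfig(env = process.env) {
  const int = (name, fallback) => {
    const parsed = Number.parseInt(env[name] ?? '', 10);
    return Number.isFinite(parsed) ? parsed : fallback;
  };
  return {
    port: int('PORT', 4880),
    basePath: env.BASE_PATH || '/endpoint-truth',
    dbPath: env.DB_PATH || './data/endpoint-truth.db',
    maxModelsPerCycle: int('MAX_MODELS_PER_CYCLE', 120),
    fetchThrottleMs: int('FETCH_THROTTLE_MS', 600),
    snapshotRetentionDays: int('SNAPSHOT_RETENTION_DAYS', 30),
  };
}

console.log(loadConfig({ PORT: '8080' }));
```

Taking the environment as a parameter keeps the function easy to test without touching `process.env`.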

Operations

- The schema is created idempotently on boot.
- On boot, if no refresh has completed in the last 2 hours, a cold-start refresh runs after 10 seconds.
- The refresh cron is locked so a slow cycle never doubles up.
- Hourly delta computation diff-checks the latest snapshot vs the earliest one in the trailing 24h window and writes a row into deltas when something notable changed.
- Daily prune drops snapshots older than SNAPSHOT_RETENTION_DAYS.
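
The cold-start rule and the refresh lock can be sketched in a few lines. This is one possible shape, assuming an in-process boolean lock and an ISO timestamp for the last completed refresh. `needsColdStart` and `singleFlight` are illustrative names, not the project's.

```javascript
const COLD_START_MAX_AGE_MS = 2 * 60 * 60 * 1000; // "last 2 hours"
const COLD_START_DELAY_MS = 10 * 1000;            // "after 10 seconds"

// True when no refresh is recorded, or the last one is older than 2 hours.
function needsColdStart(lastRefreshIso, now = Date.now()) {
  if (!lastRefreshIso) return true;
  const last = Date.parse(lastRefreshIso);
  return Number.isNaN(last) || now - last > COLD_START_MAX_AGE_MS;
}

// Wraps a task so overlapping invocations are skipped instead of stacked.
// Returns a promise for the running task, or null when the call was skipped.
function singleFlight(task) {
  let running = false;
  return function run() {
    if (running) return null; // previous cycle still in flight
    running = true;
    return Promise.resolve()
      .then(task)
      .finally(() => { running = false; });
  };
}
```

In use, the wrapped function would be passed both to node-cron's `cron.schedule(expression, fn)` and to a `setTimeout(fn, COLD_START_DELAY_MS)` guarded by `needsColdStart`, so the scheduled and cold-start paths share one lock.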

License

UNLICENSED — internal Cowork R&D product.