modelcard-radar

Auto-audits Hugging Face model cards for license, training data, evaluations, and safety disclosure. Live data, public API, no authentication.

What it does

Every six hours, modelcard-radar pulls the 200 most-downloaded models from Hugging Face's public API, fetches each model's metadata and README (READMEs are cached and refreshed every 24 hours), and scores the model card on seven independent axes:

License declared — cardData.license non-empty and not "other"
Datasets disclosed — cardData.datasets or frontmatter datasets: populated
Eval results present — model-index metrics or named benchmark mention (MMLU, HumanEval, GSM8K, …)
Intended use — README has a section or sentence describing intended use
Limitations / bias / risks — README has an explicit limitations section
Safety / ethics — README discusses safety, responsible use, or misuse
Provenance — author plus either declared base_model or recent activity
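
Three of the axes above are specific enough to sketch directly. The following is a minimal illustration, not the project's actual code: the function names are hypothetical, the license check assumes `cardData.license` is a string, and the benchmark list contains only the three names this README spells out (the real list is longer).

```javascript
// Benchmarks named in this README; the full list is elided there and not guessed here.
const NAMED_BENCHMARKS = ['MMLU', 'HumanEval', 'GSM8K'];

// License declared: cardData.license is non-empty and not "other".
function licenseDeclared(cardData) {
  const { license } = cardData || {};
  if (typeof license !== 'string') return false; // assumption: string-valued field
  const value = license.trim().toLowerCase();
  return value !== '' && value !== 'other';
}

// Datasets disclosed: a non-empty array or a non-empty string.
function datasetsDisclosed(cardData) {
  const { datasets } = cardData || {};
  if (Array.isArray(datasets)) return datasets.length > 0;
  return typeof datasets === 'string' && datasets.trim() !== '';
}

// Eval results (benchmark-mention half only): README names a known benchmark.
function mentionsBenchmark(readme) {
  const text = readme || '';
  return NAMED_BENCHMARKS.some((name) => new RegExp(`\\b${name}\\b`, 'i').test(text));
}
```

The remaining axes depend on README section detection and on what counts as "recent activity", which this document does not specify, so they are left out of the sketch.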

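Score is passing_axes × (100/7), mapped to letter grades A (≥85) through F (<40). With seven axes, a card needs at least six passing axes (score 85.7) for an A, and two or fewer (28.6 or below) is an F.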
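
As a worked example of the scoring rule, here is a small sketch. The axis keys are hypothetical, and only the A (≥85) and F (<40) boundaries come from this README. The B and C cutoffs below are assumptions, chosen so that the three possible intermediate scores (42.9, 57.1, 71.4) land on D, C, and B respectively.

```javascript
// Hypothetical axis keys; the README names the axes but not their identifiers.
const AXES = [
  'license', 'datasets', 'evals', 'intended_use',
  'limitations', 'safety', 'provenance',
];

// score = passing_axes × (100/7)
function scoreCard(axisResults) {
  const passing = AXES.filter((axis) => axisResults[axis] === true).length;
  return passing * (100 / AXES.length);
}

// A and F boundaries are from the README; B and C cutoffs are assumed.
function gradeFor(score, cutoffs = { B: 70, C: 55 }) {
  if (score >= 85) return 'A';
  if (score < 40) return 'F';
  if (score >= cutoffs.B) return 'B';
  if (score >= cutoffs.C) return 'C';
  return 'D';
}
```

For instance, a card that passes every axis except safety and limitations scores 5 × (100/7) ≈ 71.4, which is a B under the assumed cutoffs.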

Data sources (all public, no auth, no API key)

| URL | Purpose | Refresh |
|---|---|---|
| https://huggingface.co/api/models?sort=downloads&direction=-1&limit=200 | top-200 list by downloads | every 6h |
| https://huggingface.co/api/models/{model_id} | per-model metadata | every 6h |
| https://huggingface.co/{model_id}/raw/main/README.md | per-model card | every 24h |

Every fetch attempt is logged in the fetch_log table and exposed at /modelcard-radar/api/fetch-log. There is no mock, seed, or fallback data — when Hugging Face is down, the dashboard serves the last successful pull and the fetch log shows the error.
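
The log-every-attempt behavior can be sketched as a wrapper that returns a log entry whether the request succeeds or fails. This is an illustration only: `topListUrl`, `fetchWithLog`, and the log entry fields are hypothetical and do not reflect the actual fetch_log schema. It assumes Node 18+ for the global `fetch`.

```javascript
// Builds the top-list URL from the data sources table.
function topListUrl(limit = 200) {
  const url = new URL('https://huggingface.co/api/models');
  url.searchParams.set('sort', 'downloads');
  url.searchParams.set('direction', '-1');
  url.searchParams.set('limit', String(limit));
  return url.toString();
}

// Fetches a URL and always returns a log entry, even on network failure.
// `fetchImpl` is injectable so the wrapper can be exercised without network access.
async function fetchWithLog(kind, url, fetchImpl = fetch) {
  const started = Date.now();
  const log = { kind, url, status: null, ok: false, ms: 0, error: null };
  let body = null;
  try {
    const res = await fetchImpl(url);
    log.status = res.status;
    log.ok = res.ok;
    if (res.ok) body = await res.text();
  } catch (err) {
    // Network-level failure: no HTTP status is available.
    log.error = err instanceof Error ? err.message : String(err);
  }
  log.ms = Date.now() - started;
  return { body, log };
}
```

A caller would write `log` to storage on every attempt and overwrite stored rows only when `body` is non-null, which is one way to keep serving the last successful pull during an outage.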

Endpoints

All endpoints are mounted under the /modelcard-radar base path. None require auth.

GET /modelcard-radar/ — single-page dark UI
GET /modelcard-radar/health — {ok, ts, db_models, last_fetch}
GET /modelcard-radar/api/stats — totals + per-axis pass rate
GET /modelcard-radar/api/models?sort=score|downloads|likes|last_modified&order=asc|desc&grade=A|B|C|D|F&pipeline_tag=...&limit&offset
GET /modelcard-radar/api/models/{model_id} — full audit with evidence snippets
GET /modelcard-radar/api/orgs — org transparency leaderboard (min 3 models)
GET /modelcard-radar/api/orgs/{name} — single org with 30-day score history
GET /modelcard-radar/api/hall-of-shame — top-20 high-download low-score models
GET /modelcard-radar/api/hall-of-fame — A-grade models by downloads
GET /modelcard-radar/api/risers — orgs whose score moved most over 7 days
GET /modelcard-radar/api/fetch-log?limit=50 — recent fetch attempts
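
For client code, the model list query can be assembled from the documented parameters. The helper below is a hypothetical sketch; the default `base` assumes a local instance on the default port, and response shapes are not documented here, so the example stops at building the URL.

```javascript
// Query parameters accepted by GET /modelcard-radar/api/models, per the list above.
const MODEL_QUERY_PARAMS = ['sort', 'order', 'grade', 'pipeline_tag', 'limit', 'offset'];

// Builds a model-list URL, ignoring any parameter the endpoint does not document.
function modelsUrl(params = {}, base = 'http://localhost:4772') {
  const url = new URL('/modelcard-radar/api/models', base);
  for (const key of MODEL_QUERY_PARAMS) {
    if (params[key] !== undefined) {
      url.searchParams.set(key, String(params[key]));
    }
  }
  return url.toString();
}
```

For example, `modelsUrl({ grade: 'F', sort: 'downloads', order: 'desc', limit: 20 })` yields a URL for the 20 most-downloaded F-grade models, which can then be passed to `fetch`.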

Running locally

npm install
PORT=4772 node server.js
# → http://localhost:4772/modelcard-radar/

The first fetch fires ~1 second after the server starts listening (or ~30 seconds if the DB already has rows). It populates ~200 models, audits each one, and rolls up today's org_history rows, one per author. Subsequent runs upsert in place; rows are never duplicated.
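
The startup delay and the upsert-in-place behavior can be illustrated with a small sketch. Both function names are hypothetical, and the `Map` is an in-memory stand-in for the SQLite table, used only to show that repeated runs update rows rather than duplicate them.

```javascript
// Delay before the first fetch: ~1 s on an empty DB, ~30 s otherwise.
function firstFetchDelayMs(existingRowCount) {
  return existingRowCount > 0 ? 30000 : 1000;
}

// Upsert keyed by model id: new ids are inserted, existing ids are updated in place.
function upsertModels(store, models) {
  for (const model of models) {
    store.set(model.id, { ...store.get(model.id), ...model });
  }
  return store;
}
```

In SQLite itself, the same effect is usually achieved with `INSERT ... ON CONFLICT(...) DO UPDATE` against a primary key; whether this project uses that exact statement is not stated here.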

Configuration

| Env var | Default | Purpose |
|---|---|---|
| PORT | 4772 | listen port |
| DB_PATH | ./data/modelcard-radar.db | SQLite file |
| TOP_LIMIT | 200 | how many top models to audit |
| FETCH_INTERVAL_HOURS | 6 | top-list refresh cadence |
| README_INTERVAL_HOURS | 24 | per-model README refresh cadence |
| USER_AGENT | modelcard-radar/1.0 (+…) | outbound UA for HF |
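
A minimal sketch of reading these variables with their defaults follows. The `loadConfig` name and the returned field names are assumptions. USER_AGENT is omitted because its default is truncated in the table above.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Reads configuration from an env-like object, falling back to the documented defaults.
function loadConfig(env = process.env) {
  // Parse a numeric variable; unset, empty, or non-numeric values use the fallback.
  const num = (name, fallback) => {
    const raw = env[name];
    if (raw === undefined || raw === '') return fallback;
    const parsed = Number(raw);
    return Number.isFinite(parsed) ? parsed : fallback;
  };
  return {
    port: num('PORT', 4772),
    dbPath: env.DB_PATH || './data/modelcard-radar.db',
    topLimit: num('TOP_LIMIT', 200),
    fetchIntervalMs: num('FETCH_INTERVAL_HOURS', 6) * HOUR_MS,
    readmeIntervalMs: num('README_INTERVAL_HOURS', 24) * HOUR_MS,
  };
}
```

Converting the hour-based settings to milliseconds once, at load time, keeps timer code free of unit conversions.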

No ADMIN_PASS. No basic auth. No /admin route. Every endpoint is public.

Storage

SQLite in WAL (write-ahead logging) mode. Tables:

models — id, author, pipeline, downloads, likes, last_modified, gated, disabled
audits — per-model score, grade, seven axis booleans, evidence JSON
org_history — daily snapshot of (author, day) → avg_score for the Risers view
readme_cache — cached README body + HTTP status
fetch_log — every list/detail/readme fetch attempt with timing

Indexes on models(downloads), models(author), audits(score).
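
The org_history rollup amounts to grouping audited models by author and averaging their scores for a given day. The sketch below assumes the input rows already join each audit score with its model's author. The function name, row shapes, and the extra `model_count` field are hypothetical.

```javascript
// Produces one (author, day) → avg_score row per author.
function rollupOrgHistory(rows, day) {
  const byAuthor = new Map();
  for (const { author, score } of rows) {
    const acc = byAuthor.get(author) || { sum: 0, count: 0 };
    acc.sum += score;
    acc.count += 1;
    byAuthor.set(author, acc);
  }
  return [...byAuthor].map(([author, { sum, count }]) => ({
    author,
    day,
    avg_score: sum / count,
    model_count: count, // lets callers apply a minimum-model filter
  }));
}
```

Carrying the model count alongside the average would let a leaderboard apply its three-model minimum without a second query.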

Why this exists

ML procurement, legal, and compliance teams have to answer "is this model safe to ship?" — and the only public artifact is the model card. modelcard-radar puts a number on disclosure quality so teams can shortlist transparent vendors and orgs can see exactly which axes of their cards need work.

License

MIT.