llmstxt-radar
Live adoption tracker, validator, and diff watcher for the /llms.txt standard. Who ships a real llms.txt? Who lets it rot? Who quietly took it down?
Live at: <https://holyai.me/llmstxt-radar/>
JSON feed: <https://holyai.me/llmstxt-radar/feed.json>
RSS: <https://holyai.me/llmstxt-radar/feed.xml>
Health: <https://holyai.me/llmstxt-radar/health>
What
The /llms.txt proposal (Jeremy Howard / Answer.AI, Sep 2024)
is the closest thing the web has to a "robots.txt for LLMs". By May 2026
hundreds of dev-tool, framework, docs, and SaaS sites publish one — but there
is no public, machine-checked registry of who ships one, how good it is, and
when it changes.
llmstxt-radar watches a curated seed of ~100 domains, fetches /llms.txt
and /llms-full.txt every six hours, parses each file against the official
spec, scores 0–100, and publishes:
- a leaderboard sortable by spec score, freshness, or link count
- a changes feed (JSON Feed + RSS) of every detected add / update / remove / restore event
- a per-domain detail view with the parsed snapshot, scoring breakdown, and snapshot diff history
- a paste-to-validate widget that scores your own llms.txt in real time
There is no login. Every endpoint is a public read.
Why this matters
- Dev-tool docs teams can paste their llms.txt into the validator and see exactly which spec rules they fail.
- AI / IDE-agent builders can subscribe to the JSON feed and detect new /llms.txt entrants the same day they ship.
- AI docs / SEO writers can track adoption velocity.
- Casual web devs can browse well-crafted examples to copy.
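For the feed-subscription use case, a consumer might look roughly like the sketch below. It relies only on the standard JSON Feed 1.1 item fields (`id`, `title`, `url`, `date_published`). The function names are hypothetical, and because this README does not show how the feed encodes the event type (add / update / remove / restore), the sketch does not filter on it.

```javascript
// Sketch only: uses standard JSON Feed 1.1 item fields. How llmstxt-radar
// marks the event type is not documented here, so it is not inspected.
const FEED_URL = 'https://holyai.me/llmstxt-radar/feed.json';

// Return items published strictly after `sinceIso`, oldest first.
// Items without a parseable date_published are skipped.
function itemsSince(feed, sinceIso) {
  const since = Date.parse(sinceIso);
  if (Number.isNaN(since)) throw new TypeError(`Invalid date: ${sinceIso}`);
  return (feed.items ?? [])
    .map((item) => ({
      id: item.id,
      title: item.title,
      url: item.url,
      publishedAt: Date.parse(item.date_published),
    }))
    .filter((item) => item.publishedAt > since)
    .sort((a, b) => a.publishedAt - b.publishedAt);
}

// Hypothetical one-shot poller using the global fetch in Node.js 18+.
async function fetchItemsSince(sinceIso, url = FEED_URL) {
  const res = await fetch(url, {
    headers: { Accept: 'application/feed+json, application/json' },
  });
  if (!res.ok) throw new Error(`Feed request failed: HTTP ${res.status}`);
  return itemsSince(await res.json(), sinceIso);
}
```

A caller would persist the timestamp of the newest item it has seen and pass it as `sinceIso` on the next poll.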
Stack
- Node.js 18+ (ESM)
- Express 4 + helmet + compression + cors
- better-sqlite3 (WAL mode)
- node-cron for scheduled fetches
- Vanilla JS SPA, dark theme, no build step
No build step. No bundler. No TypeScript.
Data sources (all real, all public, no auth)
| Source | What we use it for | Refresh |
|---|---|---|
| Each domain's /llms.txt | The snapshot itself | every 6 h |
| Each domain's /llms-full.txt | Reachability check only (body not stored) | every 6 h |
| <https://raw.githubusercontent.com/SecretiveShell/Awesome-llms-txt/master/README.md> | Bootstrap seed of additional domains | weekly |
| <https://raw.githubusercontent.com/krish-adi/llmstxt-site/master/README.md> | Bootstrap seed of additional domains | weekly |
| seed.json (in this repo) | Hand-curated initial list of dev-tool / framework / SaaS docs | manual |
No mock data anywhere. Every score is computed from a real HTTP fetch
of the listed domain's /llms.txt. If a fetch fails we store the real HTTP
status — we never fabricate content. Math.random() is used only for cron
jitter to avoid thundering herd.
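As an illustration of the jitter idea, the sketch below delays a task by a random offset so that scheduled fetches do not all fire at the same instant. The helper names and the 60-second ceiling in the usage comment are assumptions, not values taken from this project.

```javascript
// Sketch only: helper names and the jitter ceiling are hypothetical.

// Pick a delay in [0, maxJitterMs). `rand` is injectable so the
// function can be tested deterministically.
function jitterMs(maxJitterMs, rand = Math.random) {
  return Math.floor(rand() * maxJitterMs);
}

// Run `task` after a random delay and resolve with its result.
function runWithJitter(task, maxJitterMs, rand = Math.random) {
  const delay = jitterMs(maxJitterMs, rand);
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      Promise.resolve().then(task).then(resolve, reject);
    }, delay);
  });
}

// Example: start a fetch pass somewhere within the next 60 seconds.
// runWithJitter(() => fetchAllDomains(), 60_000);  // fetchAllDomains is hypothetical
```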
Scoring rubric (0–100)
| Rule | Points |
|---|---|
| File reachable (HTTP 200) | 20 |
| Valid markdown (parses cleanly) | 10 |
| Has H1 title | 10 |
| Has blockquote summary | 10 |
| Has at least one H2 section | 5 |
| Has at least 5 valid links | 10 |
| All links are absolute URLs | 5 |
| All links have a ": description" suffix | 5 |
| File size between 256 B and 64 KB | 5 |
| Companion /llms-full.txt reachable | 10 |
| Served as text/plain or text/markdown | 5 |
| No raw HTML in the body | 5 |
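The rubric can be sketched as a pure function over one fetch result. This is an illustration under assumptions, not the project's scorer: the input shape, rule names, and regexes are hypothetical, and the "valid markdown" rule is reduced to a stand-in check (non-empty text without NUL bytes) because a full markdown parser is out of scope here.

```javascript
// Sketch only: input shape, rule names, and heuristics are assumptions.
const LINK_RE = /^\s*[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)(.*)$/; // "- [name](url): desc"
const HTML_TAG_RE = /<\/?[a-z][a-z0-9]*(\s[^>]*)?\/?>/i;      // ignores <https://...> autolinks

function scoreLlmsTxt({ status, contentType = '', body = '', fullReachable = false }) {
  const reachable = status === 200;
  const lines = reachable ? body.split(/\r?\n/) : [];
  const links = lines.map((line) => LINK_RE.exec(line)).filter(Boolean);
  const bytes = new TextEncoder().encode(body).length;
  const hasLine = (re) => lines.some((line) => re.test(line));

  // [rule, points, passed], following the rubric table row by row.
  const rules = [
    ['reachable', 20, reachable],
    ['validMarkdown', 10, reachable && body.trim().length > 0 && !body.includes('\u0000')],
    ['h1Title', 10, hasLine(/^#\s+\S/)],
    ['blockquoteSummary', 10, hasLine(/^>\s*\S/)],
    ['h2Section', 5, hasLine(/^##\s+\S/)],
    ['fiveLinks', 10, links.length >= 5],
    ['absoluteLinks', 5, links.length > 0 && links.every((m) => /^https?:\/\//i.test(m[2]))],
    ['linkDescriptions', 5, links.length > 0 && links.every((m) => /^\s*:\s*\S/.test(m[3]))],
    ['sizeInRange', 5, reachable && bytes >= 256 && bytes <= 64 * 1024],
    ['companionFull', 10, fullReachable],
    ['contentType', 5, reachable && /^text\/(plain|markdown)\b/i.test(contentType)],
    ['noRawHtml', 5, reachable && !HTML_TAG_RE.test(body)],
  ];

  const breakdown = {};
  let score = 0;
  for (const [rule, points, passed] of rules) {
    breakdown[rule] = passed ? points : 0;
    score += breakdown[rule];
  }
  return { score, breakdown };
}
```

Returning the per-rule breakdown alongside the total keeps the result usable both for a leaderboard column and for a list of failed rules.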
Public endpoints
All under BASE_PATH (default /llmstxt-radar).
GET / → SPA
GET /health → liveness JSON (200)
GET /api/stats → counts + averages
GET /api/leaderboard → ?sort=score|freshness|links (default score)
GET /api/changes → ?days=30&limit=200
GET /api/domains → ?q=&limit=&offset=
GET /api/domain/:domain → full detail incl. last 10 snapshots
GET /api/domain/:domain/raw → cached body, text/markdown
GET /api/domain/:domain/diff → ?from=:sid&to=:sid (defaults to last two)
POST /api/validate → { content } → score + violations (30/h/IP)
GET /feed.json → JSON Feed 1.1 of changes
GET /feed.xml → RSS 2.0 of changes
CORS is * for all GETs (and the validate POST).
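A small client helper for these endpoints might look like the sketch below. The helper names are hypothetical, and sending the validate payload as JSON with a `Content-Type: application/json` header is an assumption; the list above only says the body carries `{ content }`.

```javascript
// Sketch only: helper names are hypothetical; JSON encoding of the
// validate payload is an assumption.
const ORIGIN = 'https://holyai.me';
const BASE_PATH = '/llmstxt-radar';

// Build an absolute endpoint URL, skipping undefined or null query params.
function apiUrl(path, params = {}) {
  const url = new URL(BASE_PATH + path, ORIGIN);
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined && value !== null) {
      url.searchParams.set(key, String(value));
    }
  }
  return url.toString();
}

// Build the arguments for fetch() against POST /api/validate.
function validateRequest(content) {
  return {
    url: apiUrl('/api/validate'),
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ content }),
    },
  };
}

// Usage with the global fetch in Node.js 18+ (not executed here):
// const { url, init } = validateRequest('# My site\n> Summary');
// const result = await (await fetch(url, init)).json();
```

Because the validate endpoint is limited to 30 requests per hour per IP, a caller should expect an error status once that budget is spent.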
Cron schedule
| Schedule | What |
|---|---|
| `0 */6 * * *` | Fetch every active domain's /llms.txt and probe /llms-full.txt |
| `15 3 * * 0` (Sun 03:15 UTC) | Reload seed.json + pull external seed lists |
| boot + 3 s | One-shot fetch pass on first start |
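To make the two recurring schedules concrete, the sketch below checks whether a given instant falls on each one. It assumes both jobs are evaluated in UTC, and that the six-hourly job fires at minute 0 of hours 0, 6, 12, and 18; the function names are hypothetical and this is not how node-cron itself matches expressions.

```javascript
// Sketch only: assumes UTC scheduling; names are hypothetical.

// True at 00:00, 06:00, 12:00, and 18:00 UTC (the six-hourly fetch pass).
function isFetchTick(date) {
  return date.getUTCMinutes() === 0 && date.getUTCHours() % 6 === 0;
}

// True on Sundays at 03:15 UTC (the weekly seed reload).
function isSeedReloadTick(date) {
  return (
    date.getUTCDay() === 0 &&
    date.getUTCHours() === 3 &&
    date.getUTCMinutes() === 15
  );
}
```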
Running locally
cp .env.example .env
npm install
npm start
# → llmstxt-radar listening on :4826 → /llmstxt-radar/
better-sqlite3 needs native bindings. On macOS / Linux arm64 a prebuilt
binary is downloaded automatically during install. On the RNDLAB cowork
sandbox that download is blocked. This does not affect production, which
deploys to arm64 where prebuilt binaries are available.
License
MIT.