Agent Ready

Score how AI-friendly any public website is — llms.txt, AI bot policies, ai-agent.json, sitemap and more

Tags: dev-tools, ai-agents, llms-txt, agent-readiness, web-standards, leaderboard, developer-tools

agent-ready

Live agent-readiness score for any public website.

agent-ready measures how friendly a public website is to AI agents (LLM crawlers, autonomous agents, AI search products) by inspecting six well-known signals, each fetched live over HTTP. No mocks, no API keys for the core flow, no auth.

Think PageSpeed Insights but for AI-agent readiness.

What it measures

Every signal is a single public HTTP GET against the candidate domain. A failed fetch contributes 0 points and the failure mode is recorded in the per-check log.

| Signal | Max | URL we fetch | What it scores |
|---|---:|---|---|
| llms.txt | 25 | https://<domain>/llms.txt | 200 OK, text-ish, body > 100 bytes. +5 if it parses as the proposed Markdown structure (H1 + sections). |
| llms-full.txt | 10 | https://<domain>/llms-full.txt | 200 OK, text-ish, body > 500 bytes. |
| AI bot policy in robots.txt | 25 | https://<domain>/robots.txt | 3 points per distinct AI bot directive (GPTBot, ClaudeBot, anthropic-ai, OAI-SearchBot, Google-Extended, PerplexityBot, ByteSpider, CCBot, Applebot-Extended, Amazonbot, FacebookBot, Meta-ExternalAgent, cohere-ai, Diffbot, Omgilibot, YouBot, DuckAssistBot, Claude-Web, ChatGPT-User, PerplexityUser). Capped at 25. 5 points if robots.txt exists but has no AI-bot directives. |
| /.well-known/ai-agent.json | 15 | https://<domain>/.well-known/ai-agent.json | 200 OK + parses as JSON + has at least one of name, description, actions, endpoints, agents, tools, capabilities (Aiia spec). |
| /ai.txt | 10 | https://<domain>/ai.txt (and fallback https://<domain>/.well-known/ai.txt) | 200 OK, body ≥ 10 bytes. |
| Sitemap declared | 15 | https://<domain>/sitemap.xml (plus parses Sitemap: directives in robots.txt) | 15 if /sitemap.xml is a valid <urlset> or <sitemapindex>. 9 if only declared via robots.txt. |

Total possible: 100. Letter grades: A (≥85), B (≥70), C (≥50), D (≥30), F (<30).
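
The robots.txt scoring rule and the grade thresholds above can be sketched in a few lines. This is a minimal illustration, not the contents of lib/score.js: the function names `botPolicyPoints` and `grade` are hypothetical, and it assumes a failed robots.txt fetch scores 0 like any other failed fetch.

```javascript
// Hypothetical helpers; names are illustrative, not taken from lib/score.js.

// Points for the "AI bot policy in robots.txt" signal.
// robotsExists: whether /robots.txt was fetched successfully.
// distinctAiBots: number of distinct AI bots that have a directive.
function botPolicyPoints(robotsExists, distinctAiBots) {
  if (!robotsExists) return 0;         // failed fetch contributes 0 points
  if (distinctAiBots === 0) return 5;  // robots.txt present, no AI-bot directives
  return Math.min(25, 3 * distinctAiBots); // 3 points per bot, capped at 25
}

// Map a 0-100 total to a letter grade using the documented thresholds.
function grade(total) {
  if (total >= 85) return 'A';
  if (total >= 70) return 'B';
  if (total >= 50) return 'C';
  if (total >= 30) return 'D';
  return 'F';
}

console.log(botPolicyPoints(true, 4), grade(72)); // prints "12 B"
```

With 3 points per bot, the cap is reached at nine distinct AI-bot directives (27 points, clamped to 25).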

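Refresh cadence

All tracked domains are re-probed every 6 hours, 8 domains in parallel per batch (crons/refresh.js). A housekeeping job runs every 24 hours to prune and VACUUM the database (crons/housekeeping.js). On first launch, a single background refresh runs right after seeding.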

API

Base path is /agent-ready. All endpoints return JSON { ok, data?, error? }.

| Method | Path | Notes |
|---|---|---|
| GET | /agent-ready/health | No auth; expected to always return 200. |
| GET | /agent-ready/api/sites?sort=score\|recent&q=&limit=&offset= | Tracked domains list. |
| GET | /agent-ready/api/site/:domain | Latest score + 30 most recent checks with full signal breakdown. |
| GET | /agent-ready/api/stats | Index-wide totals, signal coverage %, grade spread. |
| GET | /agent-ready/api/movers?days=7 | Domains whose score changed since previous check. |
| GET | /agent-ready/api/categories | Average score per category. |
| POST | /agent-ready/api/check | Body { "domain": "example.com" }. Runs all 6 probes live, persists, returns full signal JSON. |

There is no auth, no admin, no API key required to call any endpoint.
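
As a rough sketch of calling the live-check endpoint from Node 18+ (which ships a global `fetch`): the helper names `buildCheckRequest` and `checkDomain` are hypothetical, the base URL assumes the local port from "Running locally", and the `Content-Type: application/json` header is an assumption about how the server parses the body.

```javascript
// Assumption: the server runs locally on the port shown under "Running locally".
const BASE_URL = 'http://localhost:4748/agent-ready';

// Build the request for POST /api/check without sending it.
function buildCheckRequest(domain, baseUrl = BASE_URL) {
  return {
    url: `${baseUrl}/api/check`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' }, // assumed, not documented
      body: JSON.stringify({ domain }),
    },
  };
}

// Send the request and unwrap the { ok, data?, error? } envelope.
async function checkDomain(domain, baseUrl = BASE_URL) {
  const { url, init } = buildCheckRequest(domain, baseUrl);
  const res = await fetch(url, init); // global fetch, Node 18+
  const payload = await res.json();
  if (!payload.ok) throw new Error(String(payload.error ?? 'check failed'));
  return payload.data;
}

// Usage (needs a running server):
// checkDomain('example.com').then((data) => console.log(data));
```

Keeping the request builder separate from the network call makes the request shape easy to inspect or test without a running server.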

Running locally

```sh
npm install
cp .env.example .env
npm start
```

Then open <http://localhost:4748/agent-ready/>.

The first launch seeds the tracked-domain table from data/seed-domains.json and kicks off a single background refresh (~3 minutes for ~150 domains, batched 8-parallel). The server listens immediately — /health does not block on the refresh.
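
The "batched 8-parallel" refresh described above could look roughly like the following. This is a sketch under assumptions, not the code in crons/refresh.js: `chunk` and `runInBatches` are hypothetical names, and `worker` stands in for whatever function probes a single domain.

```javascript
// Split a list into consecutive batches of at most `size` items.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run `worker` over every item, `size` at a time. Each batch finishes
// before the next one starts; a rejected promise from one worker is
// recorded in the results and does not abort the rest of the run.
async function runInBatches(items, size, worker) {
  const results = [];
  for (const batch of chunk(items, size)) {
    const settled = await Promise.allSettled(batch.map((item) => worker(item)));
    results.push(...settled);
  }
  return results;
}

// Placeholder domain names, only to show the batch count.
const domains = Array.from({ length: 150 }, (_, i) => `site${i}.example`);
console.log(chunk(domains, 8).length); // prints 19
```

To keep the server responsive at startup, such a refresh would be started without awaiting it, so that listening and /health do not wait for the batches to finish.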

No-mock guarantee

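This product complies with the cowork R&D mock-data ban. Specifically, every score comes from a live HTTP probe of the candidate domain, and data/seed-domains.json only lists the domains to probe (config, not data).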

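Stack

Node.js with Express on the server, SQLite via better-sqlite3 for storage, and a vanilla-JS single-page UI.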

Layout

```text
agent-ready/
  server.js                Express bootstrap, BASE_PATH mount, cron registration
  db.js                    better-sqlite3 + schema + prepared statements
  routes/api.js            All /api/* handlers
  lib/check.js             Single-domain probe — 6 parallel fetches with timeout
  lib/score.js             Signal weights + grade helper + AI bot list
  lib/parse-robots.js      robots.txt parser (extracts AI-bot directives + Sitemap)
  lib/seed.js              First-boot seed of the tracked-domain table
  lib/rate-limit.js        Per-IP token bucket for POST /api/check
  crons/refresh.js         6h refresh of all tracked domains, batched 8-parallel
  crons/housekeeping.js    24h prune + VACUUM
  data/seed-domains.json   ~150 curated input domains (config, not data)
  public/index.html        SPA shell
  public/app.js            Vanilla-JS UI (leaderboard, stats, movers, live check)
  public/style.css         Dark theme
```
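
For readers unfamiliar with the per-IP token bucket mentioned for lib/rate-limit.js, here is a minimal sketch of the general technique. The class name, capacity, and refill rate are all hypothetical; the actual limits used for POST /api/check are not stated in this README.

```javascript
// Hypothetical per-key token bucket; capacity and refill rate are
// illustrative values, not the ones used by lib/rate-limit.js.
class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;           // injectable clock, which eases testing
    this.buckets = new Map(); // key (e.g. client IP) -> { tokens, last }
  }

  // Returns true and spends one token if `key` may proceed.
  take(key) {
    const t = this.now();
    const b = this.buckets.get(key) ?? { tokens: this.capacity, last: t };
    const elapsedSec = (t - b.last) / 1000;
    // Refill in proportion to elapsed time, never above capacity.
    b.tokens = Math.min(this.capacity, b.tokens + elapsedSec * this.refillPerSecond);
    b.last = t;
    const allowed = b.tokens >= 1;
    if (allowed) b.tokens -= 1;
    this.buckets.set(key, b);
    return allowed;
  }
}

// Example: bursts of up to 5 checks per IP, refilling one token every 10 s.
const limiter = new TokenBucket(5, 0.1);
console.log(limiter.take('203.0.113.7')); // prints true
```

A token bucket allows short bursts up to the capacity while bounding the sustained rate, which suits an endpoint where each call triggers several outbound fetches.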

Why this exists

llms.txt has reached ~10% adoption across 300k crawled domains. WAB (Web Agent Bridge), Chrome's WebMCP, and the Aiia ai-agent.json spec all launched between January and May 2026. There is no public tracker of which top sites have adopted these standards. agent-ready ships that tracker, plus a single-domain checker so any site owner can audit their own property in seconds.

License

MIT.