agent-recall

Live, public post-merge regret leaderboard for AI coding agents. Tracks AI-coauthored merged PRs across the top public OSS repositories and watches what happens next: reverts (within 14 days) and hotfix-follows (within 7 days). Built on real public GitHub data, refreshed every 30 minutes. Reading the leaderboard requires no auth, and no mock data is ever served.

Live URL: https://holyai.me/agent-recall/
Port: 4852
Base path: /agent-recall

---

Why

Every existing AI-agent PR tracker stops at the merge boundary.
agent-recall is the first that asks the obvious follow-up question:
when the AI ships, does the code actually survive?

88% of developers in the Sonar State of Code 2026 survey report that AI tools have
produced "code that looked correct but was unreliable."
40% say AI has increased technical debt by introducing duplicative code.
And yet no public dataset measures regret-after-merge per agent.

agent-recall ships exactly that:

Revert rate: how often the standard Revert "<title>" (#N) PR appears within 14 days of an AI-coauthored merge.
Hotfix-follow rate: how often, within 7 days of an AI-coauthored merge, a fix: / hotfix: / bug: / patch: PR touches any of the same files.
Median Time-To-Regret (MTTR): the median time, in hours, from an AI-coauthored merge to its rollback.
Recall Score (0–100, higher = more regret): a composite of the three signals above.
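The detection rules above can be sketched in a few small helpers. This is a minimal illustration, not agent-recall's actual code: the function names and return shapes are hypothetical, and the sketch makes no claim about which PR the trailing (#N) refers to.

```javascript
// Sketch only: helper names and return shapes are hypothetical.
const HOUR_MS = 60 * 60 * 1000;

// Matches the standard revert title: Revert "<title>" (#N)
const REVERT_TITLE = /^Revert "(.+)" \(#(\d+)\)$/;

// Prefixes listed for hotfix-follow detection.
const HOTFIX_PREFIX = /^(?:fix|hotfix|bug|patch):/i;

function parseRevertTitle(title) {
  const m = REVERT_TITLE.exec(title.trim());
  // `number` is whatever appears in the trailing (#N).
  return m ? { originalTitle: m[1], number: Number(m[2]) } : null;
}

function isHotfixTitle(title) {
  return HOTFIX_PREFIX.test(title.trim());
}

// Hours elapsed from the AI-coauthored merge to a later event (ISO 8601 timestamps).
function hoursBetween(mergedAt, eventAt) {
  return (Date.parse(eventAt) - Date.parse(mergedAt)) / HOUR_MS;
}

// True when the event falls inside the window: 14 days for reverts, 7 for hotfixes.
function withinWindow(mergedAt, eventAt, days) {
  const hours = hoursBetween(mergedAt, eventAt);
  return hours >= 0 && hours <= days * 24;
}

// Median of a list of hour values; null when there is nothing to measure.
function median(values) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

A revert candidate would then count only if `parseRevertTitle` returns a match and `withinWindow(mergedAt, revertMergedAt, 14)` holds; the hotfix rule additionally needs the file-overlap check described under Caveats.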

Tracked agents

Each agent is detected by its public Co-Authored-By: commit trailer.
Eight agents in the v1 roster:

| Agent | Signature |
| ----------------- | ---------------------------------------------------------------------------- |
| Claude Code | Co-Authored-By: Claude <[email protected]> |
| GitHub Copilot | Co-Authored-By: Copilot <[email protected]> |
| Cursor | Co-Authored-By: Cursor Agent <[email protected]> |
| OpenAI Codex | Co-Authored-By: openai-codex[bot] |
| Devin | Co-Authored-By: devin-ai-integration[bot] |
| Aider | Co-Authored-By: Aider |
| Jules / Gemini | Co-Authored-By: Jules <jules-ai[bot]@users.noreply.github.com> |
| Cline | Co-Authored-By: Cline <cline-ai[bot]@users.noreply.github.com> |
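
As a rough sketch of trailer-based detection, the snippet below scans a commit message for Co-Authored-By lines and maps the display name to an agent. It matches on the name part only, which is an assumption made for illustration (the real detector may also check the email), and the agent ids on the right-hand side are hypothetical.

```javascript
// Display names come from the signature table; the ids are hypothetical.
const AGENT_BY_TRAILER_NAME = new Map([
  ['claude', 'claude-code'],
  ['copilot', 'github-copilot'],
  ['cursor agent', 'cursor'],
  ['openai-codex[bot]', 'openai-codex'],
  ['devin-ai-integration[bot]', 'devin'],
  ['aider', 'aider'],
  ['jules', 'jules'],
  ['cline', 'cline'],
]);

// One trailer per line: "Co-Authored-By: Name <email>"; the email part is optional here.
const TRAILER = /^co-authored-by:[ \t]*([^<\r\n]+?)[ \t]*(?:<[^>\r\n]*>)?[ \t]*$/gim;

// Returns the distinct agent ids credited in a commit message, in order of appearance.
function detectAgents(commitMessage) {
  const found = new Set();
  for (const match of commitMessage.matchAll(TRAILER)) {
    const agent = AGENT_BY_TRAILER_NAME.get(match[1].toLowerCase());
    if (agent) found.add(agent);
  }
  return [...found];
}
```

Matching is case-insensitive because Git trailers appear in the wild as both Co-Authored-By and Co-authored-by.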

Data sources

| Source | Endpoint | Refresh |
| --------------------- | --------------------------------------------------- | --------- |
| GitHub Search API | /search/issues?q=is:pr is:merged … | 30 min |
| GitHub Search API | /search/issues?q=is:pr "Revert" in:title … | 30 min |
| GitHub REST API | /repos/:owner/:name/pulls | 30–60 min |
| GitHub REST API | /repos/:owner/:name/pulls/:n/files | on demand |
| GitHub Search API | /search/repositories?q=stars:>5000 pushed:>=… | 24 h |

All requests to GitHub are authenticated with a personal access token that
needs no scopes, since only public data is read. If the token is missing or
invalid, agent-recall boots in degraded mode and serves only the last cached
snapshot (no synthetic data is ever generated).
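
The token check could look roughly like the sketch below. The function names are hypothetical; the header names and values are the ones GitHub documents for its REST API. A missing token can be detected at boot, whereas an invalid one only shows up as a 401 on the first request, which this sketch does not cover.

```javascript
// Sketch: returns request headers for the GitHub REST API, or null when no
// token is configured (the caller would then serve the cached snapshot only).
function githubHeaders(token) {
  const trimmed = (token ?? '').trim();
  if (trimmed === '') return null; // no token: degraded mode
  return {
    Accept: 'application/vnd.github+json',
    Authorization: `Bearer ${trimmed}`,
    'X-GitHub-Api-Version': '2022-11-28',
    'User-Agent': 'agent-recall',
  };
}

// Hypothetical boot-time decision based on the GITHUB_TOKEN from .env.
function bootMode(env) {
  return githubHeaders(env.GITHUB_TOKEN) ? 'live' : 'degraded';
}
```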

Local development

cp .env.example .env
# edit .env to add a real GITHUB_TOKEN (public repos only, no scopes needed)
npm install
npm start
# → http://127.0.0.1:4852/agent-recall/

The first ai_merges fetch fires within 30 minutes of startup. To see data
sooner, trigger the relevant cron job manually from a Node REPL.

API

All endpoints are public and require no auth. CORS is open.

| Method | Path | Description |
| ------ | ------------------------------------------ | -------------------------------------------- |
| GET | /agent-recall/health | Liveness check (always 200). |
| GET | /agent-recall/api/leaderboard | Recall Score leaderboard (8 rows). |
| GET | /agent-recall/api/agent/:id | Per-agent 90-day trend + recent regrets. |
| GET | /agent-recall/api/recent-regrets | Latest revert + hotfix events. |
| GET | /agent-recall/api/repos | Monitored repos, ranked by AI-merge volume. |
| GET | /agent-recall/api/stats | Totals + uptime. |
| GET | /agent-recall/api/api-budget | GitHub API budget snapshot (transparency). |
| GET | /agent-recall/api/biggest-regret-week | Fastest revert in the last 7 days. |
| GET | /agent-recall/og/:agent.svg | 1200×630 SVG share card per agent. |
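
A minimal client sketch for the endpoints above, assuming Node 20's built-in fetch. The helper names are hypothetical, and the response bodies are returned as parsed JSON without assuming any particular shape, since the README does not document one.

```javascript
const BASE_URL = 'https://holyai.me'; // host of the Live URL
const BASE_PATH = '/agent-recall';

// Builds an absolute URL for a path from the endpoint table, e.g. '/api/leaderboard'.
function apiUrl(path, base = BASE_URL) {
  return new URL(`${BASE_PATH}${path}`, base).toString();
}

// Per-agent endpoint; the id is percent-encoded before being placed in the path.
function agentUrl(id, base) {
  return apiUrl(`/api/agent/${encodeURIComponent(id)}`, base);
}

// Fetches and parses JSON; throws on any non-2xx status.
async function getJson(url, fetchImpl = fetch) {
  const res = await fetchImpl(url, { headers: { Accept: 'application/json' } });
  if (!res.ok) throw new Error(`GET ${url} failed with ${res.status}`);
  return res.json();
}
```

For a local instance, pass the development origin instead: `getJson(apiUrl('/api/leaderboard', 'http://127.0.0.1:4852'))`.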

Recall Score formula

recall_score = round(min(100,
  40 * revert_rate +
  30 * hotfix_rate +
  30 * (1 - clamp(mttr_hours, 0, 168) / 168)
))

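The weights sum to 100 when revert_rate and hotfix_rate are fractions in [0, 1].
All else equal, faster regret scores worse: the MTTR term contributes close to
30 points for a revert at hour 1 and 0 points at hour 168 or later.
An agent with zero regret events scores 0, which implies the MTTR term is
skipped when there is no regret event to time.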
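
A direct transcription of the formula into JavaScript, as a sketch. Two assumptions are made here and are not confirmed by the source: the rates are fractions in [0, 1], and an agent with no regret events passes mttrHours as null so that the speed term is dropped and the score comes out as 0. The function and parameter names are hypothetical.

```javascript
function clamp(x, lo, hi) {
  return Math.min(hi, Math.max(lo, x));
}

// revertRate and hotfixRate are assumed to be fractions in [0, 1].
// mttrHours is null when the agent has no regret events (assumption).
function recallScore({ revertRate, hotfixRate, mttrHours }) {
  // Speed term: 30 points for an instant rollback, 0 at 168 hours or more.
  const speedTerm =
    mttrHours == null ? 0 : 30 * (1 - clamp(mttrHours, 0, 168) / 168);
  return Math.round(
    Math.min(100, 40 * revertRate + 30 * hotfixRate + speedTerm)
  );
}
```

For example, a 10% revert rate, a 20% hotfix-follow rate, and an MTTR of 84 hours give 4 + 6 + 15 = 25.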

Stack

Node 20, Express 4
better-sqlite3 (WAL mode)
node-cron for the five fetcher jobs
helmet + compression + trust proxy
undici for outbound HTTP
Vanilla-JS SPA frontend, no build step

Caveats

Only Revert "..." (#N) titles are detected reliably. Squash-rollbacks done by hand without that exact title are missed.
Hotfix detection is heuristic: a follow-up PR touching the same files for unrelated reasons can be a false positive. The files_overlap_count is exposed in the per-agent feed so reviewers can audit.
We only attribute to the agent, not to the human who pressed merge.
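
To make the hotfix heuristic concrete, here is a sketch of how a value like files_overlap_count could be derived from two lists of changed file paths. The function name and the exact counting rule (distinct shared paths) are assumptions for illustration; the source only states that the count is exposed.

```javascript
// Counts distinct file paths changed by both the AI-coauthored PR and the
// follow-up PR. Inputs are arrays of paths, such as the `filename` values
// returned by GitHub's "list pull request files" endpoint.
function filesOverlapCount(aiMergeFiles, followUpFiles) {
  const original = new Set(aiMergeFiles);
  const followUp = new Set(followUpFiles); // de-duplicate before counting
  let count = 0;
  for (const file of followUp) {
    if (original.has(file)) count += 1;
  }
  return count;
}
```

A count of 0 would rule the follow-up out as a hotfix-follow; a low count on a large PR is the kind of case a reviewer may want to audit by hand.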

License

MIT — see LICENSE if present, otherwise treat this file as the license header.

Sister products

merge-trail — merge rates per agent.
agent-bloat — PR diff size per agent.
pr-bounce — rejection reasons per agent.