agent-recall
Live, public post-merge regret leaderboard for AI coding agents. Tracks AI-coauthored merged PRs across the top public OSS repositories and watches what happens next: reverts (within 14 days) and hotfix-follows (within 7 days). Real public GitHub data, refreshed every 30 minutes, no auth required to read, no mocks.
Live URL: https://holyai.me/agent-recall/
Port: 4852
Base path: /agent-recall
---
Why
Every existing AI-agent PR tracker stops at the merge boundary. agent-recall is the first that asks the obvious follow-up question:
when the AI ships, does the code actually survive?
- 88% of devs in the Sonar State of Code 2026 survey report that AI tools have produced "code that looked correct but was unreliable."
- 40% say AI has increased technical debt by introducing duplicative code.
- And yet no public dataset measures regret-after-merge per agent.
agent-recall ships exactly that:
- Revert rate: how often the standard Revert "<title>" (#N) PR appears within 14 days of an AI-coauthored merge.
- Hotfix-follow rate: within 7 days of an AI merge, did a fix:/hotfix:/bug:/patch: PR touch any of the same files?
- Median Time-To-Regret (MTTR): the speed of the rollback.
- Recall Score (0–100, higher = more regret): a composite signal.
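The two detection windows can be sketched as a single classifier. This is a minimal illustration, not the project's actual code: the function name, the object fields (title, mergedAt, createdAt, files), and the choice to measure from the AI merge time to the follow-up's creation time are all assumptions.

```javascript
// Hypothetical sketch: classify a follow-up PR against an AI-coauthored merge.
// Field names (title, mergedAt, createdAt, files) are assumptions for illustration.
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;
const HOTFIX_PREFIX = /^(fix|hotfix|bug|patch):/i;

function classifyFollowUp(aiMerge, followUp) {
  const elapsedMs = Date.parse(followUp.createdAt) - Date.parse(aiMerge.mergedAt);
  if (Number.isNaN(elapsedMs) || elapsedMs < 0) return null;

  // Revert: standard title that quotes the original PR title, within 14 days.
  if (followUp.title.startsWith(`Revert "${aiMerge.title}"`) && elapsedMs <= 14 * DAY_MS) {
    return { kind: "revert", hoursToRegret: elapsedMs / HOUR_MS };
  }

  // Hotfix-follow: prefixed title, within 7 days, touching at least one shared file.
  if (HOTFIX_PREFIX.test(followUp.title) && elapsedMs <= 7 * DAY_MS) {
    const overlap = followUp.files.filter((f) => aiMerge.files.includes(f));
    if (overlap.length > 0) {
      return {
        kind: "hotfix",
        hoursToRegret: elapsedMs / HOUR_MS,
        filesOverlapCount: overlap.length,
      };
    }
  }
  return null;
}
```

Returning the overlap count alongside the classification keeps the heuristic auditable, since a reviewer can see how much of the original change the follow-up actually touched.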
Tracked agents
Each agent is detected by its public Co-Authored-By: commit trailer.
Eight agents in the v1 roster:
| Agent | Signature |
| ----------------- | ---------------------------------------------------------------------------- |
| Claude Code | Co-Authored-By: Claude <[email protected]> |
| GitHub Copilot | Co-Authored-By: Copilot <[email protected]> |
| Cursor | Co-Authored-By: Cursor Agent <[email protected]> |
| OpenAI Codex | Co-Authored-By: openai-codex[bot] |
| Devin | Co-Authored-By: devin-ai-integration[bot] |
| Aider | Co-Authored-By: Aider |
| Jules / Gemini | Co-Authored-By: Jules <jules-ai[bot]@users.noreply.github.com> |
| Cline | Co-Authored-By: Cline <cline-ai[bot]@users.noreply.github.com> |
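Trailer-based detection could look roughly like the sketch below. It matches on the trailer name only and ignores the email part, which is a simplification for illustration; the agent ids on the left of each pair and the function name are hypothetical and not taken from the project.

```javascript
// Hypothetical sketch: map a commit message to an agent id via its
// Co-Authored-By trailer. Ids (first element of each pair) are made up here;
// the trailer names (second element) come from the roster table.
const AGENT_TRAILER_NAMES = [
  ["claude-code", "Claude"],
  ["github-copilot", "Copilot"],
  ["cursor", "Cursor Agent"],
  ["openai-codex", "openai-codex[bot]"],
  ["devin", "devin-ai-integration[bot]"],
  ["aider", "Aider"],
  ["jules", "Jules"],
  ["cline", "Cline"],
];

// Captures the name part of a trailer line; the <email> part is optional.
const TRAILER = /^co-authored-by:\s*([^<]+?)\s*(?:<[^>]*>)?\s*$/i;

function detectAgent(commitMessage) {
  for (const line of commitMessage.split(/\r?\n/)) {
    const match = TRAILER.exec(line.trim());
    if (!match) continue;
    const name = match[1].toLowerCase();
    const hit = AGENT_TRAILER_NAMES.find(([, trailerName]) => trailerName.toLowerCase() === name);
    if (hit) return hit[0];
  }
  return null; // no known agent trailer found
}
```

Matching the whole name rather than a prefix avoids attributing a commit co-authored by, say, "Claude Smith" to an agent, though a name-only match can still collide with a human who uses exactly the same name.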
Data sources
| Source | Endpoint | Refresh |
| --------------------- | --------------------------------------------------- | --------- |
| GitHub Search API | /search/issues?q=is:pr is:merged … | 30 min |
| GitHub Search API | /search/issues?q=is:pr "Revert" in:title … | 30 min |
| GitHub REST API | /repos/:owner/:name/pulls | 30–60 min |
| GitHub REST API | /repos/:owner/:name/pulls/:n/files | on demand |
| GitHub Search API | /search/repositories?q=stars:>5000 pushed:>=… | 24 h |
All requests are authenticated with a public-scope GitHub PAT. If the token
is missing or invalid, agent-recall boots in degraded mode and serves
only the last cached snapshot (no synthetic data is ever generated).
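The degraded-mode rule can be illustrated with a small selector. This is only a sketch of the behavior described above: the function name, its parameters, and the returned shape are assumptions, and how the token is actually validated is not shown.

```javascript
// Hypothetical sketch: decide what to serve, given token state and available data.
// `fresh` is newly fetched data (if any); `cached` is the last stored snapshot.
function selectSnapshot({ token, tokenValid, fresh, cached }) {
  const degraded = !token || !tokenValid;
  if (!degraded && fresh) {
    return { mode: "live", data: fresh };
  }
  // Fall back to the cache only. If there is no cache either, serve nothing
  // rather than fabricating rows.
  return { mode: degraded ? "degraded" : "live", data: cached ?? null };
}
```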
Local development
cp .env.example .env
# edit .env to add a real GITHUB_TOKEN (public repos only, no scopes needed)
npm install
npm start
# → http://127.0.0.1:4852/agent-recall/
The first ai_merges fetch fires within 30 minutes; if you want to see data
faster you can hit the relevant cron job manually from a Node REPL.
API
All endpoints are public, no auth. CORS open.
| Method | Path | Description |
| ------ | ------------------------------------------ | -------------------------------------------- |
| GET | /agent-recall/health | Liveness check (always 200). |
| GET | /agent-recall/api/leaderboard | Recall Score leaderboard (8 rows). |
| GET | /agent-recall/api/agent/:id | Per-agent 90-day trend + recent regrets. |
| GET | /agent-recall/api/recent-regrets | Latest revert + hotfix events. |
| GET | /agent-recall/api/repos | Monitored repos, ranked by AI-merge volume. |
| GET | /agent-recall/api/stats | Totals + uptime. |
| GET | /agent-recall/api/api-budget | GitHub API budget snapshot (transparency). |
| GET | /agent-recall/api/biggest-regret-week | Fastest revert in the last 7 days. |
| GET | /agent-recall/og/:agent.svg | 1200×630 SVG share card per agent. |
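A minimal client sketch for the endpoints above, using the global fetch available in Node 20. The helper names are hypothetical, and the sketch assumes the /api/ endpoints return JSON, which the table does not state explicitly; the response shapes are likewise unspecified, so the result is returned as-is.

```javascript
// Hypothetical client helpers. BASE points at a local dev server; swap in the
// live URL to query the public instance.
const BASE = "http://127.0.0.1:4852/agent-recall";

// Join base and path without doubling or dropping slashes.
function apiUrl(path, base = BASE) {
  return `${base.replace(/\/+$/, "")}/${path.replace(/^\/+/, "")}`;
}

// Assumption: /api/ endpoints respond with JSON.
async function getJson(path, base = BASE) {
  const res = await fetch(apiUrl(path, base));
  if (!res.ok) throw new Error(`GET ${path} failed with status ${res.status}`);
  return res.json();
}
```

For example, getJson("api/leaderboard").then(console.log) would print whatever the leaderboard endpoint returns.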
Recall Score formula
recall_score = round(min(100,
40 * revert_rate +
30 * hotfix_rate +
30 * (1 - clamp(mttr_hours, 0, 168) / 168)
))
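The MTTR term falls linearly from 30 points at hour 0 to 0 at hour 168 and beyond: a median of 1 hour adds about 29.8 points, a median of 84 hours adds 15, all else equal.
An agent with zero regret events scores 0, which means the MTTR term is dropped when there is nothing to time (evaluated at mttr_hours = 0 it would add 30).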
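The formula translates to JavaScript roughly as follows. Two assumptions are baked into this sketch: the rates are fractions in [0, 1] rather than percentages, and an agent with no regret events is represented by a null mttrHours, in which case the MTTR term contributes nothing so that the score comes out as 0.

```javascript
// Sketch of the Recall Score formula. Assumes revertRate and hotfixRate are
// fractions in [0, 1] and mttrHours is null when there are no regret events.
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

function recallScore({ revertRate, hotfixRate, mttrHours }) {
  // With no events there is no time-to-regret, so the speed term is 0.
  const speedTerm = mttrHours == null ? 0 : 1 - clamp(mttrHours, 0, 168) / 168;
  const raw = 40 * revertRate + 30 * hotfixRate + 30 * speedTerm;
  return Math.round(Math.min(100, raw));
}

console.log(recallScore({ revertRate: 0.1, hotfixRate: 0.2, mttrHours: 84 })); // 25
```

The worked case is 40 × 0.1 + 30 × 0.2 + 30 × (1 − 84/168) = 4 + 6 + 15 = 25.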
Stack
- Node 20, Express 4
- better-sqlite3 (WAL mode)
- node-cron for the five fetcher jobs
- helmet + compression + trust proxy
- undici for outbound HTTP
- Vanilla-JS SPA frontend, no build step
Caveats
- Only Revert "..." (#N) titles are detected reliably. Squash-rollbacks done by hand without that exact title are missed.
- Hotfix detection is heuristic: a follow-up PR touching the same files for unrelated reasons can be a false positive. The files_overlap_count is exposed in the per-agent feed so reviewers can audit.
- We only attribute to the agent, not to the human who pressed merge.
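The first caveat follows directly from how strict a title match has to be. A sketch of such a matcher, with a hypothetical function name and return shape, shows why hand-written rollbacks slip through:

```javascript
// Hypothetical sketch: parse the Revert "<title>" (#N) format. Anything that
// deviates from this exact shape returns null and would go undetected.
const REVERT_TITLE = /^Revert "(.+)" \(#(\d+)\)$/;

function parseRevertTitle(title) {
  const match = REVERT_TITLE.exec(title.trim());
  if (!match) return null;
  return { revertedTitle: match[1], number: Number(match[2]) };
}

console.log(parseRevertTitle('Revert "Add cache" (#12)'));
// { revertedTitle: 'Add cache', number: 12 }
console.log(parseRevertTitle("Roll back the cache change")); // null
```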
License
MIT — see LICENSE if present, otherwise treat this file as the license header.
Sister products
- merge-trail — merge rates per agent.
- agent-bloat — PR diff size per agent.
- pr-bounce — rejection reasons per agent.