
Sycophancy Leaderboard

Which LLMs push back and which just agree with you: four axes, live from syco-bench.


syco-board

Live LLM sycophancy leaderboard. It aggregates the public
timfduffy/syco-bench benchmark with arXiv,
Hacker News, and GitHub activity into a single dashboard, and answers the question:
"Which model will push back, and which will just agree with me?"

No authentication. No API keys required to run. Every endpoint is public.

What it tracks

Each model is scored on four 0–5 axes from syco-bench (lower is better):

The composite syco score is the mean of the four axes.
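
As a minimal sketch of that calculation: the function name `compositeScore` and the plain array input are illustrative assumptions, not syco-board's actual code or syco-bench's column names. It only shows the arithmetic of averaging four 0–5 axis scores.

```javascript
// Hypothetical helper: averages four axis scores, each expected in the 0-5 range.
// Lower results mean less sycophancy, matching the "lower is better" convention.
function compositeScore(axes) {
  if (!Array.isArray(axes) || axes.length !== 4) {
    throw new RangeError("expected exactly four axis scores");
  }
  for (const score of axes) {
    if (typeof score !== "number" || Number.isNaN(score) || score < 0 || score > 5) {
      throw new RangeError(`axis score outside the 0-5 range: ${score}`);
    }
  }
  const sum = axes.reduce((total, score) => total + score, 0);
  return sum / axes.length;
}

console.log(compositeScore([1, 2, 3, 4])); // 2.5
```

Because the composite is an unweighted mean, a model that does badly on one axis can still rank well overall, so the per-axis columns remain worth reading.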

The Δsys column compares each model's score with and without the provider's
recommended system prompt. A positive Δsys means the provider's own deployment makes
its model more sycophantic than the bare model.
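
The sign convention can be sketched as follows. This assumes Δsys is the score with the system prompt minus the score without it, which is the reading consistent with "positive means more sycophantic" when lower scores are better. The function names are hypothetical.

```javascript
// Assumption: delta = (score with provider system prompt) - (score of bare model).
function deltaSys(withSystemPrompt, bare) {
  return withSystemPrompt - bare;
}

// Turns the sign of the delta into a human-readable verdict.
function describeDeltaSys(delta) {
  if (delta > 0) return "more sycophantic with the system prompt";
  if (delta < 0) return "less sycophantic with the system prompt";
  return "no change";
}

const delta = deltaSys(3, 2.5);
console.log(delta, describeDeltaSys(delta)); // 0.5 more sycophantic with the system prompt
```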

Data sources (live, no mock)

| Source | URL | Refresh |
| --- | --- | --- |
| syco-bench master CSV (primary) | https://raw.githubusercontent.com/timfduffy/syco-bench/main/output/20250510_132850_combined_output/master_results.csv | 6h |
| arXiv: all:"sycophancy" | https://export.arxiv.org/api/query?search_query=all:%22sycophancy%22&max_results=50&sortBy=submittedDate&sortOrder=descending | 6h |
| Hacker News Algolia | https://hn.algolia.com/api/v1/search_by_date?query=sycophancy&hitsPerPage=30&tags=(story,comment) | 3h |
| GitHub repos: sycophancy benchmark | https://api.github.com/search/repositories?q=sycophancy+benchmark&sort=updated&per_page=20 | 24h |
| syco-bench commits | https://api.github.com/repos/timfduffy/syco-bench/commits?per_page=20 | 12h |
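
The refresh intervals in the table could drive a simple "is this source due?" check. The sketch below is an assumption about how such a scheduler might look. The source keys and the `isDue` function are hypothetical, and only the hour values come from the table.

```javascript
// Refresh intervals in hours, copied from the table above; the keys are illustrative.
const REFRESH_HOURS = {
  "syco-bench-csv": 6,
  "arxiv": 6,
  "hacker-news": 3,
  "github-repos": 24,
  "syco-bench-commits": 12,
};

// Returns true when a source has never been fetched or its interval has elapsed.
function isDue(source, lastFetchedMs, nowMs = Date.now()) {
  const hours = REFRESH_HOURS[source];
  if (hours === undefined) {
    throw new Error(`unknown source: ${source}`);
  }
  if (lastFetchedMs === null || lastFetchedMs === undefined) {
    return true;
  }
  return nowMs - lastFetchedMs >= hours * 60 * 60 * 1000;
}

console.log(isDue("hacker-news", null)); // true
```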

If a fetcher fails, the previous snapshot is kept and the failure is recorded in the
snapshots table; nothing is mocked.
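
A minimal sketch of that keep-the-last-good-snapshot behaviour, assuming an in-memory `Map` and array in place of the real SQLite snapshots table. `refreshSource`, `store`, and `log` are hypothetical names used only for illustration.

```javascript
// Hypothetical shapes: `store` maps a source name to its last good snapshot,
// and `log` collects one record per refresh attempt, successful or not.
async function refreshSource(name, fetcher, store, log) {
  const attemptedAt = new Date().toISOString();
  try {
    const data = await fetcher();
    store.set(name, { data, fetchedAt: attemptedAt });
    log.push({ source: name, ok: true, attemptedAt });
  } catch (err) {
    // On failure the store is left untouched, so readers keep the previous snapshot.
    const message = err instanceof Error ? err.message : String(err);
    log.push({ source: name, ok: false, attemptedAt, error: message });
  }
  return store.get(name);
}
```

The function never substitutes placeholder data: if the first fetch for a source fails, it returns `undefined` rather than a made-up snapshot.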

Running locally

```shell
npm install
cp .env.example .env
npm start
# open http://localhost:4850/syco-board/
```

better-sqlite3 builds a native binding on install. On arm64 macOS / Linux this is a
prebuilt download; if your environment cannot reach nodejs.org, install the
@node-rs/sqlite toolchain or run the production deploy where prebuilds are available.

HTTP endpoints

All under /syco-board:

Environment

| Var | Default | Purpose |
| --- | --- | --- |
| PORT | 4850 | Listen port |
| BASE_PATH | /syco-board | URL prefix (must match nginx route) |
| DB_PATH | ./data/syco-board.db | SQLite path |
| GITHUB_TOKEN | _unset_ | Optional; lifts the 60 req/h anonymous GitHub limit |
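
The defaults in the table could be applied with a small loader like the one below. This is a sketch under assumptions, not the project's actual startup code: `loadConfig` and `githubHeaders` are hypothetical names, and the only GitHub-specific detail relied on is that the REST API accepts a token in an `Authorization: Bearer` header.

```javascript
// Hypothetical loader: reads the four variables and falls back to the documented defaults.
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.PORT || "4850", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`invalid PORT: ${env.PORT}`);
  }
  return {
    port,
    basePath: env.BASE_PATH || "/syco-board",
    dbPath: env.DB_PATH || "./data/syco-board.db",
    githubToken: env.GITHUB_TOKEN || null, // optional; null means anonymous requests
  };
}

// Builds request headers for GitHub API calls, adding auth only when a token is set.
function githubHeaders(config) {
  const headers = { Accept: "application/vnd.github+json" };
  if (config.githubToken) {
    headers.Authorization = `Bearer ${config.githubToken}`;
  }
  return headers;
}

console.log(loadConfig({}).port); // 4850
```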

Credits

The hard work is upstream: Tim Duffy's
timfduffy/syco-bench supplies the model
scores; the syco-board project just aggregates and ranks them, and links them to surrounding
research and discussion. If you use these numbers, cite syco-bench, not syco-board.

License

MIT.