grep-tax

How much does your repo cost an AI agent to navigate?

grep-tax runs a fixed 5-question benchmark against any public GitHub repo, comparing naive grep+read (what Claude Code and Cursor do today) against BM25 chunk retrieval. Out comes a letter grade, a token-per-query number, and a shareable README badge.

Paste a public GitHub repo URL. A scan takes roughly 30–90 s for small repos.

How grep-tax works

Every scan runs the same five navigation questions a developer (or an AI coding agent) would ask of an unfamiliar repo: where is auth handled, where are env vars read, where are HTTP routes defined, where is the database initialized, and where is error handling and logging configured.

For each question we run two strategies, naive grep+read and BM25 chunk retrieval, and count tokens with gpt-tokenizer (cl100k_base, the same tokenizer family Cursor and Claude Code charge against).
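
To make the comparison concrete, here is a minimal Python sketch of how the two strategies might be costed on a toy repo. It is an illustration under stated assumptions, not grep-tax's actual code: `count_tokens` is a crude stand-in for gpt-tokenizer, and the chunk size, the `k1`/`b` values, the top-k cutoff, the sample files, and every function name are hypothetical.

```python
import math
import re
from collections import Counter

def count_tokens(text: str) -> int:
    # Stand-in tokenizer (assumption): counts word runs and punctuation marks.
    # The real scans count cl100k_base tokens with gpt-tokenizer.
    return len(re.findall(r"\w+|[^\w\s]", text))

def tokenize(text: str) -> list[str]:
    # Lowercased identifier-like terms used for BM25 matching.
    return re.findall(r"[a-z0-9_]+", text.lower())

def grep_and_read(files: dict[str, str], keyword: str) -> int:
    # Naive strategy: every file containing the keyword is read in full.
    return sum(count_tokens(body) for body in files.values()
               if keyword.lower() in body.lower())

def chunk(files: dict[str, str], lines_per_chunk: int = 4) -> list[tuple[str, str]]:
    # Split each file into fixed-size line chunks (the size is an assumption).
    chunks = []
    for path, body in files.items():
        lines = body.splitlines()
        for i in range(0, len(lines), lines_per_chunk):
            chunks.append((path, "\n".join(lines[i:i + lines_per_chunk])))
    return chunks

def bm25_top_k(chunks, query, k=2, k1=1.5, b=0.75):
    # Okapi BM25 over chunks; returns up to k chunks with a positive score.
    if not chunks:
        return []
    docs = [tokenize(text) for _, text in chunks]
    n = len(docs)
    avg_len = (sum(len(d) for d in docs) / n) or 1.0
    df = Counter(term for d in docs for term in set(d))
    scored = []
    for (path, text), d in zip(chunks, docs):
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scored.append((score, path, text))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(path, text) for score, path, text in scored[:k] if score > 0]

# Toy repo: one long file with a small relevant region, one unrelated file.
files = {
    "auth.py": "def login(user):\n    token = sign(user)\n    return token\n"
               + "# padding line\n" * 20,
    "util.py": "def slugify(s):\n    return s.lower()\n",
}

grep_cost = grep_and_read(files, "login")
hits = bm25_top_k(chunk(files), "login token")
bm25_cost = sum(count_tokens(text) for _, text in hits)
print(f"grep+read: {grep_cost} tokens, BM25 chunks: {bm25_cost} tokens")
```

In this toy case, grep+read pays for the whole of `auth.py`, while BM25 returns only the one chunk that mentions the query terms. The gap between those two counts is what the token-per-query number is meant to capture.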

Ground truth is deterministic: each question has a regex set; any file containing a match is considered relevant. The methodology section on every scorecard lists exactly which regexes were used so the grade is reproducible.
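
As a rough illustration of this rule, the Python sketch below marks a file as relevant when any regex in a question's set matches its contents. The patterns, question keys, and sample files are hypothetical placeholders; the regexes actually used are the ones listed on each scorecard.

```python
import re

# Hypothetical regex sets for two of the five questions (assumption).
QUESTION_PATTERNS = {
    "auth": [r"\bjwt\b", r"\blogin\b", r"passport"],
    "env_vars": [r"process\.env\.", r"os\.environ", r"\bgetenv\("],
}

def relevant_files(files: dict[str, str], patterns: list[str]) -> set[str]:
    # A file is relevant if at least one pattern matches anywhere in it.
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return {path for path, body in files.items()
            if any(rx.search(body) for rx in compiled)}

# Toy repo contents keyed by path.
sample_files = {
    "src/auth.ts": "export function login(req) { return jwt.sign(req.user) }",
    "src/config.ts": "const port = process.env.PORT",
    "README.md": "# Demo project",
}

auth_truth = relevant_files(sample_files, QUESTION_PATTERNS["auth"])
env_truth = relevant_files(sample_files, QUESTION_PATTERNS["env_vars"])
print(sorted(auth_truth), sorted(env_truth))
```

Because matching depends only on the file contents and the pattern list, rerunning it on the same commit yields the same relevant set, which is what makes the grade reproducible.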

Out of scope: private repos, custom queries, real embeddings, language-specific parsing. The point is to give maintainers a fast, comparable number, not a perfect benchmark.