skill-clash

Stop your skills from fighting each other. Conflict, overlap and duplication detector for AI agent skill collections. Live at https://holyai.me/skill-clash/.

skill-clash scans any public GitHub repo containing SKILL.md / AGENTS.md / CLAUDE.md files and computes a 4-dimensional clash report:

Trigger collision — do two skills fire on the same prompt?
Redundancy — are two skills near-duplicates?
Contradiction — do two skills give opposing instructions?
Combined token tax — how heavy is the autoload?

Each dimension is scored 0–10. Total: 0–40. Verdicts range from Clean room (36+) to Demolition derby (≤ 10).

Why this exists

The Agent Skills ecosystem exploded in 2026. AGENTS.md is adopted by 60,000+ open-source projects; Skills.sh hosts 91k+ skills; and LobeHub, Agensi, ClaudeSkills.info, ComposioHQ, Microsoft, and Anthropic each maintain their own registry. A May 2026 community audit found that 73% of 214 audited skills scored below 60/100 for basic quality.

When developers install 20+ skills, the skills routinely fire on the same trigger phrases, duplicate each other's instructions, or contradict each other outright. There is no tool that scans a skill collection and tells you which skills will clash. skill-clash is that tool.

Run locally

cp .env.example .env
# optional: drop a GitHub token in for higher rate limits
npm install
npm start
# → http://localhost:4824/skill-clash/

The database is a single SQLite file at data/skill-clash.db (WAL mode). No external services are required.

No auth, ever

Every endpoint — reads, writes, refresh, scan — is public. No Basic Auth, no API keys for users, no admin pass. We want anyone to be able to drop a repo in and get a report instantly.

Endpoints

All under BASE_PATH=/skill-clash:

| Method | Path | Purpose |
|---|---|---|
| GET | /skill-clash/health | Liveness — returns 200 always |
| GET | /skill-clash/api/leaderboard?limit=50 | Top-scoring collections |
| GET | /skill-clash/api/worst?limit=20 | Lowest-scoring (the "demolition derby") |
| GET | /skill-clash/api/clean?limit=20 | Cleanest collections (alias of leaderboard) |
| GET | /skill-clash/api/recent?limit=20 | Most recently scanned |
| GET | /skill-clash/api/stats | Ecosystem aggregates |
| GET | /skill-clash/api/conflict-types | Histogram of conflict types |
| GET | /skill-clash/api/collection/:owner/:name | Full report + per-pair conflicts |
| POST | /skill-clash/api/scan | Body: {repo:"owner/name", path:"<optional>"} |
| POST | /skill-clash/api/refresh | Manual cron nudge (rate-limited 1/min/IP) |
| GET | /skill-clash/api/share/:owner/:name.svg | OG-style SVG share card |
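
As a rough illustration of the scan endpoint, the sketch below builds the POST request described in the table. The `buildScanRequest` helper, the `BASE_URL` value (taken from the local dev server), and the JSON content type are assumptions. The response shape is not documented in this README, so the example does not parse it.

```javascript
// Hypothetical helper: builds fetch() arguments for POST /skill-clash/api/scan.
// BASE_URL assumes the local dev server from "Run locally".
const BASE_URL = 'http://localhost:4824/skill-clash';

function buildScanRequest(repo, path) {
  // The table documents the repo field as "owner/name".
  if (!/^[^\/\s]+\/[^\/\s]+$/.test(repo)) {
    throw new Error('repo must look like "owner/name"');
  }
  const body = { repo };
  if (path) body.path = path; // path is optional per the table
  return {
    url: `${BASE_URL}/api/scan`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    },
  };
}

// Usage (Node 18+ ships a global fetch):
//   const { url, options } = buildScanRequest('anthropics/skills');
//   const res = await fetch(url, options);
```

No auth header is set, which matches the "No auth, ever" policy above.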

Data sources — all live, all public

We never fabricate data. Every score is computed from a live GitHub API response or a raw file fetch. If a fetch fails, we surface an honest empty state with an error_reason.

| Repo | Pattern | Refresh |
|---|---|---|
| anthropics/skills | **/SKILL.md | every 6h |
| microsoft/skills | **/SKILL.md | every 6h |
| mattpocock/skills | **/SKILL.md | every 6h |
| addyosmani/agent-skills | **/{SKILL,AGENTS}.md | every 6h |
| VoltAgent/awesome-agent-skills | **/*.md | every 6h |
| ComposioHQ/awesome-claude-skills | **/SKILL.md | every 6h |
| agentsmd/agents.md | **/AGENTS.md | every 6h |
| netresearch/claude-code-marketplace | **/SKILL.md | every 6h |
| heilcheng/awesome-agent-skills | **/*.md | every 6h |
| antfu/skills | **/SKILL.md | every 6h |
| forrestchang/andrej-karpathy-skills | **/CLAUDE.md | every 6h |
| User-submitted | per-request via POST /api/scan | cached 1h |

GitHub API requests use GITHUB_TOKEN from the environment, when it is set, for a higher rate limit, and fall back to unauthenticated requests otherwise. On 403/429 we back off exponentially and never fabricate a score.
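
The retry behaviour described above could look roughly like the following. This is a sketch under stated assumptions, not the project's actual code: the base delay, cap, retry count, helper names, and the `rate_limited_…` reason string are all hypothetical.

```javascript
// Hypothetical retry policy; delays, retry count, and names are assumptions.
const RETRYABLE = new Set([403, 429]);

// Use GITHUB_TOKEN when present, otherwise send an unauthenticated request.
function githubHeaders(env = process.env) {
  const headers = { Accept: 'application/vnd.github+json' };
  if (env.GITHUB_TOKEN) headers.Authorization = `Bearer ${env.GITHUB_TOKEN}`;
  return headers;
}

// attempt 0 -> base, 1 -> 2x base, 2 -> 4x base, ... up to capMs.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 60000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

async function fetchWithBackoff(url, { fetchImpl = fetch, sleep, maxRetries = 4 } = {}) {
  const wait = sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt++) {
    const res = await fetchImpl(url, { headers: githubHeaders() });
    if (!RETRYABLE.has(res.status)) return { ok: res.ok, res };
    if (attempt >= maxRetries) {
      // Give up with an empty state and a reason instead of inventing a score.
      return { ok: false, error_reason: `rate_limited_${res.status}` };
    }
    await wait(backoffDelayMs(attempt));
  }
}
```

Injecting `fetchImpl` and `sleep` keeps the loop testable without real network calls or real delays.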

How clash scoring works

For a collection of N skills, we extract triggers, imperatives, and content hashes from each skill, then compare every pair of skills.

Trigger collision (0–10): Jaccard similarity on extracted triggers, plus tf-idf cosine similarity on descriptions. A pair collides if Jaccard ≥ 0.3 or cosine ≥ 0.55. The score ramps linearly from 10 (no collisions) to 0 (N/4 or more colliding pairs).
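
A minimal sketch of the trigger-collision rule, assuming triggers arrive as arrays of strings and that the tf-idf cosine value is computed elsewhere. The function names are illustrative, and treating two empty trigger sets as non-colliding is an assumption.

```javascript
// Jaccard similarity of two trigger lists: |A ∩ B| / |A ∪ B|.
function jaccard(a, b) {
  const A = new Set(a);
  const B = new Set(b);
  if (A.size === 0 && B.size === 0) return 0; // assumption: empty sets never collide
  let intersection = 0;
  for (const x of A) if (B.has(x)) intersection++;
  return intersection / (A.size + B.size - intersection);
}

// A pair collides if either threshold from the text is met.
function collides(jaccardSim, cosineSim) {
  return jaccardSim >= 0.3 || cosineSim >= 0.55;
}

// Linear ramp: 10 with no colliding pairs, 0 at N/4 colliding pairs or more.
function triggerScore(collisionPairs, skillCount) {
  const limit = skillCount / 4;
  if (limit <= 0) return 10;
  return Math.max(0, 10 * (1 - collisionPairs / limit));
}
```

For example, two skills with triggers `['review', 'pr', 'lint']` and `['pr', 'lint', 'test']` share two of four distinct triggers, so their Jaccard similarity is 0.5 and the pair collides.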

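Redundancy (0–10): SHA-256 content hash (exact duplicate → red), then 64-permutation MinHash on 5-character shingles (similarity > 0.85 → orange; similarity > 0.65 and token-set containment > 0.5 → yellow). Score = 10 − 2×red − 1×orange − 0.5×yellow, where red, orange, and yellow are the number of pairs at each level.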
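
The redundancy pipeline can be sketched as follows. This is an illustration only: the seeded FNV-1a hash stands in for whatever hash family the real MinHash uses, token-set containment is passed in rather than computed, and clamping the score at 0 is an assumption.

```javascript
const { createHash } = require('node:crypto');

// Exact-duplicate check: identical content yields an identical digest.
function sha256(text) {
  return createHash('sha256').update(text, 'utf8').digest('hex');
}

// Set of overlapping k-character shingles (k = 5 per the text).
function shingles(text, k = 5) {
  const out = new Set();
  for (let i = 0; i + k <= text.length; i++) out.add(text.slice(i, i + k));
  return out;
}

// 32-bit FNV-1a with a seed mixed in; an illustrative stand-in for 64 permutations.
function seededHash(str, seed) {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Signature slot p holds the smallest hash of any shingle under seed p.
function minhashSignature(shingleSet, perms = 64) {
  const sig = new Array(perms).fill(0xffffffff);
  for (const s of shingleSet) {
    for (let p = 0; p < perms; p++) {
      const h = seededHash(s, p);
      if (h < sig[p]) sig[p] = h;
    }
  }
  return sig;
}

// The fraction of matching slots estimates the Jaccard similarity.
function minhashSimilarity(sigA, sigB) {
  let same = 0;
  for (let i = 0; i < sigA.length; i++) if (sigA[i] === sigB[i]) same++;
  return same / sigA.length;
}

function classifyPair(a, b, containment) {
  if (sha256(a) === sha256(b)) return 'red';
  const sim = minhashSimilarity(minhashSignature(shingles(a)), minhashSignature(shingles(b)));
  if (sim > 0.85) return 'orange';
  if (sim > 0.65 && containment > 0.5) return 'yellow';
  return 'none';
}

function redundancyScore({ red = 0, orange = 0, yellow = 0 }) {
  return Math.max(0, 10 - 2 * red - 1 * orange - 0.5 * yellow); // clamp is an assumption
}
```

Texts shorter than five characters produce no shingles, so a real implementation would need a separate rule for them.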

Contradiction (0–10): Extract positive (must / always / prefer) and negative (never / do not / avoid) imperatives, then look for matching phrases with opposite polarity. A pair counts only if the two skills also overlap in triggers or imperatives, so unrelated skills don't produce false positives. Score = 10 − 2 × the number of contradicting pairs.
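
A deliberately simple sketch of the contradiction heuristic. It assumes one imperative per sentence and treats two imperatives as "matching" only when their normalized phrases are identical; the real matcher is presumably more forgiving. The function names and the clamp at 0 are assumptions.

```javascript
const NEGATIVE = /\b(never|do not|avoid)\s+(.+)/i;
const POSITIVE = /\b(must|always|prefer)\s+(.+)/i;

// Lowercase, strip punctuation, collapse whitespace.
function normalize(s) {
  return s.toLowerCase().replace(/[^a-z0-9\s]/g, ' ').replace(/\s+/g, ' ').trim();
}

// Returns [{ polarity: 1 | -1, phrase }] for each sentence with an imperative.
function extractImperatives(text) {
  const out = [];
  for (const sentence of text.split(/[.!?\n]+/)) {
    // Check negatives first so "must never X" is read as negative.
    const neg = sentence.match(NEGATIVE);
    const m = neg || sentence.match(POSITIVE);
    if (!m) continue;
    out.push({ polarity: neg ? -1 : 1, phrase: normalize(m[2]) });
  }
  return out;
}

// True if the two skills share a phrase with opposite polarity.
function contradicts(skillA, skillB) {
  const a = extractImperatives(skillA);
  const b = extractImperatives(skillB);
  return a.some((x) => b.some((y) => x.phrase === y.phrase && x.polarity !== y.polarity));
}

function contradictionScore(contradictingPairs) {
  return Math.max(0, 10 - 2 * contradictingPairs); // clamp is an assumption
}
```

With this sketch, "Always use semicolons." and "Never use semicolons." contradict each other, while "Always use semicolons." and "Never use tabs." do not.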

Token tax (0–10): Sum of frontmatter + description tokens (estimated as chars / 3.5) across all skills. ≤ 1.5k tokens = 10, ≤ 3k = 8, ≤ 6k = 5, ≤ 10k = 2, otherwise 0.
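
The token-tax thresholds map directly to a small lookup. The skill object shape (`frontmatter` and `description` string fields) and the function names below are assumptions made for illustration.

```javascript
// Rough token estimate from character count, per the chars / 3.5 rule.
function estimateTokens(chars) {
  return chars / 3.5;
}

// Hypothetical skill shape: { frontmatter: string, description: string }.
function collectionTokens(skills) {
  let chars = 0;
  for (const s of skills) chars += s.frontmatter.length + s.description.length;
  return estimateTokens(chars);
}

function tokenTaxScore(totalTokens) {
  if (totalTokens <= 1500) return 10;
  if (totalTokens <= 3000) return 8;
  if (totalTokens <= 6000) return 5;
  if (totalTokens <= 10000) return 2;
  return 0;
}
```

A collection whose frontmatter and descriptions total 7,000 characters comes to about 2,000 estimated tokens and scores 8.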

Verdicts

| Score | Stars | Label |
|---|---|---|
| 36–40 | ★★★★★ | Clean room |
| 28–35 | ★★★★ | Mostly clear |
| 20–27 | ★★★ | Some friction |
| 11–19 | ★★ | Tangled |
| 0–10 | ★ | Demolition derby |
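
The verdict table can be expressed as a lookup on the summed score. This sketch assumes the four dimension scores are simply added, and it resolves fractional totals (for example 35.5) by using each band's lower bound as a threshold; both the function name and that tie-breaking choice are assumptions.

```javascript
// Bands from the verdict table, highest first; min is the band's lower bound.
const VERDICTS = [
  { min: 36, stars: 5, label: 'Clean room' },
  { min: 28, stars: 4, label: 'Mostly clear' },
  { min: 20, stars: 3, label: 'Some friction' },
  { min: 11, stars: 2, label: 'Tangled' },
  { min: 0, stars: 1, label: 'Demolition derby' },
];

function verdict({ trigger, redundancy, contradiction, tokenTax }) {
  const total = trigger + redundancy + contradiction + tokenTax;
  const band = VERDICTS.find((v) => total >= v.min);
  return { total, stars: '★'.repeat(band.stars), label: band.label };
}
```

For instance, scores of 8, 7, 6, and 6 add up to 27, which lands in "Some friction" with three stars.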

Stack

Node 18+, Express 4, better-sqlite3 (WAL), node-cron, helmet, compression, morgan, yaml. Vanilla JS SPA — no framework, no build step.

What's out of scope (v1)

No LLM-based contradiction detection. We use deterministic heuristics only. v2 may add an optional LLM upgrade for ambiguous pairs.
No paid-marketplace scraping (Agensi and Skills.sh require auth).
No private repos.
No skill quality grading beyond clash dimensions — karpathy-score already covers per-file grading.

License

MIT.