
Trace Grid

Live comparison grid for LLM & agent observability tools — momentum, releases, pricing drift

Tags: ai · llm · observability · tracing · agent · evals · leaderboard · dev-tools

trace-grid

Live comparison grid for LLM and agent observability tools — GitHub momentum, release cadence, SDK download velocity, deployment model and pricing-page drift, refreshed automatically. No mocks. No auth. Public.

The LLM observability category exploded in 2026: Langfuse was acquired by ClickHouse in January, Arize Phoenix went OTel-native, AgentOps expanded to 400+ LLMs, Helicone shipped built-in caching, Pydantic launched Logfire, Datadog rolled out LLM Observability, plus a wave of newcomers (Laminar, Latitude, Tracea, Fabraix). Choosing a stack is now harder than choosing a vector DB was in 2023. trace-grid is the page that says, as of right now, how each vendor stacks up — with momentum, not marketing.

Live: <https://holyai.me/trace-grid/>

What's on the board

For every tracked tool, the grid shows:

You can filter by deploy model and category (tracing / evals / prompt / agent), search, and tick up to four rows to open a side-by-side compare view. The detail panel for each tool includes an embeddable SVG badge.

Data sources — real, runtime, no mocks

| Source | URL pattern | Refresh frequency |
|---|---|---|
| GitHub repo metadata | https://api.github.com/repos/{owner}/{name} | hourly |
| GitHub releases | https://api.github.com/repos/{owner}/{name}/releases?per_page=10 | every 4 hours |
| GitHub commits | https://api.github.com/repos/{owner}/{name}/commits?per_page=100 | every 4 hours |
| npm registry meta | https://registry.npmjs.org/{package} | every 6 hours |
| npm download counts | https://api.npmjs.org/downloads/range/last-week/{package} | every 6 hours |
| PyPI meta | https://pypi.org/pypi/{package}/json | every 6 hours |
| PyPI downloads | https://pypistats.org/api/packages/{package}/recent?period=week | every 6 hours |
| Pricing pages (HTML) | the vendor's public /pricing URL | every 12 hours |
| GitHub topic search | https://api.github.com/search/repositories?q=topic:llm-observability+stars:>50 | daily |

Each entry above maps to a real fetch() call in fetchers/*.js. Every call gets logged to the fetch_log table with HTTP status, duration, and error. The Methodology tab in the SPA lists every URL and its last fetch result — the visible audit trail is the guarantee that no number is fabricated.
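As a rough illustration of the logging described above, here is a minimal sketch of a fetch wrapper that records one audit row per request. The function name `loggedFetch`, the row field names, and the `record` callback (standing in for an INSERT into fetch_log) are assumptions for this example, not the project's actual code.

```javascript
// Hypothetical helper: wraps one HTTP call and emits one audit row per request.
// `record` stands in for an INSERT into the fetch_log table; field names are assumed.
async function loggedFetch(url, { fetchImpl = globalThis.fetch, record = console.log } = {}) {
  const startedAt = Date.now();
  const row = { url, status: null, duration_ms: null, error: null };
  try {
    const res = await fetchImpl(url, { headers: { 'User-Agent': 'trace-grid-example' } });
    row.status = res.status;
    if (!res.ok) {
      row.error = `HTTP ${res.status}`;
      return null;
    }
    // Awaited inside the try block so invalid JSON is also logged as an error.
    return await res.json();
  } catch (err) {
    // Network failures and parse errors land here; status stays null if no response arrived.
    row.error = String(err.message || err);
    return null;
  } finally {
    // Runs on every path, so successes and failures both reach the audit trail.
    row.duration_ms = Date.now() - startedAt;
    record(row);
  }
}
```

Passing `fetchImpl` and `record` as parameters is a choice made for this sketch so the wrapper can be exercised without network access or a database.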

Pricing pages are intentionally not parsed for dollar amounts. Vendors structure pricing very differently (per-trace / per-span / per-seat / enterprise quote / free tier) and a naive extractor would report misleading numbers. Instead we strip scripts and styles, hash the remaining visible text with SHA-256, and surface only the fact that the page changed, with a link to the source.
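The fingerprinting idea can be sketched as follows. This is an approximation under stated assumptions: the regex-based tag stripping, the whitespace normalization, and the function names (`visibleText`, `pricingFingerprint`, `hasPricingChanged`) are illustrative and may differ from what the project does.

```javascript
const { createHash } = require('node:crypto');

// Rough visible-text extraction with regexes (an assumption for this sketch);
// a real HTML parser would handle edge cases more reliably.
function visibleText(html) {
  return html
    .replace(/<script\b[\s\S]*?<\/script\s*>/gi, ' ') // drop inline and external scripts
    .replace(/<style\b[\s\S]*?<\/style\s*>/gi, ' ')   // drop stylesheets
    .replace(/<!--[\s\S]*?-->/g, ' ')                 // drop comments
    .replace(/<[^>]+>/g, ' ')                         // drop remaining tags
    .replace(/\s+/g, ' ')                             // collapse whitespace so layout noise is ignored
    .trim();
}

// SHA-256 hex digest of the visible text.
function pricingFingerprint(html) {
  return createHash('sha256').update(visibleText(html), 'utf8').digest('hex');
}

// True when the page's visible text differs from the stored fingerprint.
function hasPricingChanged(previousFingerprint, html) {
  return previousFingerprint !== pricingFingerprint(html);
}
```

With this approach, a change to a script or stylesheet alone leaves the fingerprint untouched, while any edit to the visible wording produces a different hash.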

Tracked tools (initial)

Langfuse · Arize Phoenix · Helicone · AgentOps · LangSmith · W&B Weave · OpenLLMetry · Lunary · Comet Opik · Pydantic Logfire · Laminar · Latitude · Braintrust · Galileo · HoneyHive.

To add or remove a tool, edit tools.config.js and redeploy. The next boot upserts the identity records and the cron picks up metrics on the next sweep.
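For orientation, a config entry might look something like the sketch below. The real schema of tools.config.js is not shown here, so every field name (`id`, `github`, `npm`, `pypi`, `pricingUrl`, `deploy`, `categories`) and the helper `sourceUrls` are hypothetical. Only the URL patterns are taken from the data-sources table.

```javascript
// Hypothetical entry shape; the actual tools.config.js schema may differ.
const exampleTool = {
  id: 'example-tool',
  name: 'Example Tool',
  github: { owner: 'example-org', name: 'example-tool' },
  npm: 'example-tool',
  pypi: 'example-tool',
  pricingUrl: 'https://example.com/pricing',
  deploy: 'self-hosted',            // assumed value for the deploy-model filter
  categories: ['tracing', 'evals'], // drawn from tracing / evals / prompt / agent
};

// Expands one entry into the per-tool URLs listed in the data-sources table.
function sourceUrls(tool) {
  const urls = [];
  if (tool.github) {
    const base = `https://api.github.com/repos/${tool.github.owner}/${tool.github.name}`;
    urls.push(base, `${base}/releases?per_page=10`, `${base}/commits?per_page=100`);
  }
  if (tool.npm) {
    urls.push(
      `https://registry.npmjs.org/${tool.npm}`,
      `https://api.npmjs.org/downloads/range/last-week/${tool.npm}`,
    );
  }
  if (tool.pypi) {
    urls.push(
      `https://pypi.org/pypi/${tool.pypi}/json`,
      `https://pypistats.org/api/packages/${tool.pypi}/recent?period=week`,
    );
  }
  if (tool.pricingUrl) urls.push(tool.pricingUrl);
  return urls;
}
```

In this sketch, a tool with no public repository or SDK package would simply omit the corresponding field, and the matching URLs would not be generated.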

Endpoints (all public, no auth)

Run locally

```sh
npm install                  # better-sqlite3 is a native module, you'll need build tools
cp .env.example .env
node server.js
# → http://localhost:4758/trace-grid/
```

A boot sweep runs ~8 s after the server starts listening; the grid fills in as the sweep completes. Without a GH_TOKEN, GitHub's unauthenticated rate limit is 60 requests per hour, and the cron schedule keeps total calls comfortably under that.
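A back-of-the-envelope check of the rate-limit claim, as a sketch. It assumes the 15 initially tracked tools each map to exactly one GitHub repository and uses the refresh intervals from the data-sources table. The daily topic search is left out because GitHub meters search requests under a separate limit, and extra calls from a boot sweep after a restart are not modeled.

```javascript
// Assumptions: 15 tracked tools, one GitHub repo each, intervals from the data-sources table.
const TOOL_COUNT = 15;
const CORE_ENDPOINTS = [
  { name: 'repo metadata', everyHours: 1 },
  { name: 'releases', everyHours: 4 },
  { name: 'commits', everyHours: 4 },
];
const ANONYMOUS_LIMIT_PER_HOUR = 60;

// Long-run average: each endpoint contributes toolCount / interval calls per hour.
function averageCallsPerHour(toolCount, endpoints) {
  return endpoints.reduce((sum, e) => sum + toolCount / e.everyHours, 0);
}

// Worst case: every endpoint's schedule fires for every tool within the same hour.
function peakCallsPerHour(toolCount, endpoints) {
  return toolCount * endpoints.length;
}

const average = averageCallsPerHour(TOOL_COUNT, CORE_ENDPOINTS);
const peak = peakCallsPerHour(TOOL_COUNT, CORE_ENDPOINTS);
console.log(`average: ${average} req/h, peak: ${peak} req/h, limit: ${ANONYMOUS_LIMIT_PER_HOUR} req/h`);
```

Under these assumptions the average is 22.5 requests per hour and the worst hour is 45, both below the limit of 60. Adding more tools narrows that margin, which is one reason to set GH_TOKEN once the list grows.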

Stack

License

MIT.