Live price index
| # | Provider | SKU | Bucket | Method | $/M tok | Raw | Source | Captured |
|---|---|---|---|---|---|---|---|---|
Cost calculator
Movers
Provider coverage
Methodology
What we measure
Every SKU is normalized to USD per 1 million training tokens at 1 epoch. The cost calculator scales that rate by your dataset_tokens × epochs, expressed in millions of tokens.
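As a rough illustration of the calculator arithmetic, here is a minimal sketch. The function name `estimateCostUsd` and the example inputs are assumptions made for illustration; this is not the site's actual calculator code.

```javascript
// Hypothetical helper: scales a normalized $/M-tok rate to a full training run.
// The name, signature, and validation are illustrative assumptions.
function estimateCostUsd(usdPerMillionTokens, datasetTokens, epochs = 1) {
  if (!(usdPerMillionTokens >= 0) || !(datasetTokens >= 0) || !(epochs >= 0)) {
    throw new RangeError("all inputs must be non-negative numbers");
  }
  // Total tokens processed across all epochs.
  const trainedTokens = datasetTokens * epochs;
  // The rate is per 1M tokens, so convert the token count to millions.
  return usdPerMillionTokens * (trainedTokens / 1_000_000);
}

// Made-up inputs: a $0.50/M-tok rate, a 20M-token dataset, 3 epochs.
console.log(estimateCostUsd(0.5, 20_000_000, 3)); // 30
```

Because the index stores every price at 1 epoch, the epoch count only enters at this final step.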
Token-priced providers (direct)
Together, Fireworks, OpenAI, Vertex, Predibase, and Mistral all publish per-token pricing. We scrape each provider's official pricing page (URLs in the Providers tab) on a cron that runs every 6 to 24 hours, depending on the provider, and store the published value verbatim.
GPU-time providers (normalized)
Replicate and Modal bill by GPU-second or GPU-hour. We convert to $/M-tok using a model-size curve in lib/normalize.js. The curve maps each (GPU SKU × param bucket) pair to an estimated number of GPU-seconds required to process 1M training tokens, assuming LoRA SFT at batch size 16 and sequence length 2048. The estimates are conservative, and they are exposed in the normalization_method field on every price row so you can audit them.
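The conversion described above reduces to one multiplication: price per GPU-second times estimated GPU-seconds per 1M tokens. The sketch below shows that shape only. The table name, the function name, and every number in the table are placeholders invented for illustration; they are not the values or the API of lib/normalize.js.

```javascript
// Placeholder curve: GPU-seconds needed per 1M training tokens, keyed by
// GPU SKU and parameter bucket. These figures are invented for illustration.
const GPU_SECONDS_PER_M_TOKENS = {
  "A100-80GB": { "7B": 400, "13B": 800 },
};

// Hypothetical converter from an hourly GPU price to a $/M-tok row.
function gpuPriceToUsdPerMTok(usdPerGpuHour, gpuSku, paramBucket) {
  const gpuSeconds = GPU_SECONDS_PER_M_TOKENS[gpuSku]?.[paramBucket];
  // No curve entry means no estimate: return null rather than guess.
  if (gpuSeconds === undefined) return null;
  const usdPerGpuSecond = usdPerGpuHour / 3600;
  return {
    usdPerMTok: usdPerGpuSecond * gpuSeconds,
    // Record how the number was derived so the row can be audited.
    normalization_method:
      `${gpuSku} x ${paramBucket}: ${gpuSeconds} GPU-s per 1M tok ` +
      `(LoRA SFT, batch 16, seq 2048)`,
  };
}

// Made-up input: a GPU billed at $3.60/hour, which is $0.001 per second.
const exampleRow = gpuPriceToUsdPerMTok(3.6, "A100-80GB", "7B");
console.log(exampleRow.usdPerMTok.toFixed(2), exampleRow.normalization_method);
```

Returning null for an unknown (GPU SKU × param bucket) pair keeps the sketch consistent with the rule of never inventing a price.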
No mock data
If a provider's parser fails (HTTP error, layout change, missing label), the row keeps its last-known-good value. We never invent a price, never use Math.random(), and never ship a hardcoded fallback. Failed parses are written to a public fetch_errors log surfaced in /api/stats.
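The last-known-good rule can be sketched as follows. The data shapes here are assumptions: `rows` is a Map standing in for the price store, `fetchErrors` is an array standing in for the fetch_errors log, and `applyParseResult` is a hypothetical name. The real fetchers may be structured differently.

```javascript
// Hypothetical update step: accept a parsed price, or log the failure and
// leave the stored row exactly as it was.
function applyParseResult(rows, fetchErrors, key, result, now = new Date()) {
  const price = result.ok ? result.usdPerMTok : undefined;
  if (typeof price === "number" && Number.isFinite(price) && price >= 0) {
    rows.set(key, { usdPerMTok: price, captured: now.toISOString() });
    return rows.get(key);
  }
  // Failure path: record the error, never substitute a made-up value.
  fetchErrors.push({
    key,
    reason: result.error ?? "missing or invalid price",
    at: now.toISOString(),
  });
  return rows.get(key); // last-known-good row, or undefined if none exists
}

// Made-up example: one good parse followed by a failed one.
const rows = new Map();
const fetchErrors = [];
applyParseResult(rows, fetchErrors, "example-provider/example-sku", { ok: true, usdPerMTok: 0.48 });
applyParseResult(rows, fetchErrors, "example-provider/example-sku", { ok: false, error: "HTTP 503" });
console.log(rows.get("example-provider/example-sku").usdPerMTok, fetchErrors.length); // 0.48 1
```

A SKU that has never parsed successfully has no row at all in this sketch, which matches the idea that a missing price is shown as missing rather than filled in.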
Refresh cadence
- Token-priced providers other than Predibase and Mistral: every 6 hours
- GPU-time providers: every 12 hours
- Predibase, Mistral: every 24 hours
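One way to express this schedule is a per-provider interval table plus a due check. This is a sketch under two assumptions: the lowercase provider keys and the `isDue` helper are hypothetical, and the 6-hour tier is read as covering the token-priced providers that do not have their own 24-hour entry.

```javascript
// Refresh interval in hours per provider, following the cadence list above.
// Keys are hypothetical identifiers, not names taken from the repo.
const REFRESH_HOURS = {
  together: 6, fireworks: 6, openai: 6, vertex: 6, // token-priced
  replicate: 12, modal: 12,                        // GPU-time
  predibase: 24, mistral: 24,                      // slower token-priced tier
};

// Returns true when a provider's last fetch is at least one interval old.
function isDue(provider, lastFetchedMs, nowMs = Date.now()) {
  const hours = REFRESH_HOURS[provider];
  if (hours === undefined) throw new Error(`unknown provider: ${provider}`);
  return nowMs - lastFetchedMs >= hours * 3600 * 1000;
}

// Made-up example: a fetch that happened 7 hours ago.
const sevenHoursAgo = Date.now() - 7 * 3600 * 1000;
console.log(isDue("openai", sevenHoursAgo), isDue("modal", sevenHoursAgo)); // true false
```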
Source code
All fetchers and the normalization curve are open in this repo. See fetchers/*.js and lib/normalize.js.