tool-use-arena LLM function-calling leaderboard
# | Model | Org | License | Overall % | Δ7d | Multi-Turn % | Web Search % | Memory % | Cost ($) | Latency (s)

Data source: Berkeley Function-Calling Leaderboard (BFCL v4). Refreshed every 6 hours. N/A = not evaluated on that category.