Mock fixtures
Gauntlet dashboard
You are viewing the frontend against mock fixtures. Sign in to switch to live backend data.
Surface
Design the product so insight emerges fast: status first, repeated patterns second, raw evidence last.
Mode
High-signal operator view for debugging agent behavior, docs quality, and product adoption risk.
Demo mode is active. Sign in to load your real projects and runs from the hosted backend.
Recent runs
2
Up to the last 25 runs shown
Succeeded
1
Remote status = succeeded
Failed
0
Remote status = failed
Latest success rate
67%
Batch gauntlet_203
Recent batches
Runs submitted through the CLI or hosted backend.
| Run | Status | Created | Batch | Success | Failures | Top blame |
|---|---|---|---|---|---|---|
| run_demo_001 | succeeded | May 16, 12:00 PM | gauntlet_203 | 67% | 3 | agent |
| run_demo_002 | running | May 16, 2:40 PM | — | 0% | — | — |
Top issue groups
SDK method-path mismatch
Agents emitted Steel.scrape instead of the tool's expected method path.
Repeated grounding loop
Some personas re-grounded instead of executing after docs retrieval.
Latest recommendations
Normalize SDK method-path variants
Priority: high. Map common SDK shapes onto the expected runtime contract.
Tighten finalization checks
Priority: medium. Require evidence-backed extraction answers before finalizing a run.
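
As a rough illustration of the "Normalize SDK method-path variants" recommendation, the sketch below maps the method-path spellings an agent might emit onto one canonical path before the call is checked against the runtime contract. Only `Steel.scrape` comes from the issue group above. The canonical path `steel.scrape`, the `client.scrape` alias, and the helper name `normalize_method_path` are assumptions made for the example, not the tool's actual contract.

```python
from typing import Optional

# Assumed canonical path; the dashboard does not state the real one.
CANONICAL_SCRAPE = "steel.scrape"

# Hypothetical alias table, keyed by lowercased method path.
# "Steel.scrape" is the variant reported in the issue group; the
# "client.scrape" entry is purely illustrative.
ALIASES = {
    "steel.scrape": CANONICAL_SCRAPE,
    "client.scrape": CANONICAL_SCRAPE,
}


def normalize_method_path(raw: str) -> Optional[str]:
    """Return the canonical method path for a known variant, else None."""
    key = raw.strip()
    # Agents sometimes emit the call form; drop a trailing "()".
    if key.endswith("()"):
        key = key[:-2]
    # Lookup is case-insensitive so "Steel.scrape" and "steel.scrape" agree.
    return ALIASES.get(key.lower())
```

Returning `None` for an unknown path, instead of guessing, would let the caller still record a genuine mismatch as a failure rather than hide it.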