Mock fixtures

Gauntlet dashboard

You are viewing the frontend against mock fixtures. Sign in to switch to live backend data.

Surface

Design the product so insight emerges fast: status first, repeated patterns second, raw evidence last.

Mode

High-signal operator view for debugging agent behavior, docs quality, and product adoption risk.

Recent runs

Last 25 runs shown

Succeeded

Remote status = succeeded

Failed

Remote status = failed

Latest success rate

67%

Batch gauntlet_203
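
How a success rate like the one above could be derived, as a minimal sketch. It assumes the rate counts only finished runs (remote status succeeded or failed) and rounds to a whole percent; the dashboard's actual formula is not shown here, and the function name is hypothetical.

```python
from typing import Iterable, Optional


def success_rate(statuses: Iterable[str]) -> Optional[int]:
    """Percent of finished runs that succeeded, or None if none finished.

    Assumption: only "succeeded" and "failed" count as finished;
    any other status (for example "running") is ignored.
    """
    finished = [s for s in statuses if s in ("succeeded", "failed")]
    if not finished:
        return None  # nothing to report yet, rather than a misleading 0%
    succeeded = sum(1 for s in finished if s == "succeeded")
    return round(100 * succeeded / len(finished))
```

Under these assumptions, a batch with 6 succeeded and 3 failed runs would report 67. Those counts are illustrative only, not taken from gauntlet_203.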

Recent batches

Runs submitted through the CLI or hosted backend.

Run	Status	Created	Batch	Success	Failures	Top blame
run_demo_001	succeeded	May 16, 12:00 PM	gauntlet_203	67%	3	agent
run_demo_002	running	May 16, 2:40 PM	—	0%	—	—

Top issue groups

SDK method-path mismatch

Agents emitted Steel.scrape instead of the tool's expected method path.

Repeated grounding loop

Some personas re-grounded instead of executing after docs retrieval.
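
For the method-path mismatch listed above, one possible approach is an alias table that maps emitted call shapes onto a single canonical name. This is a sketch only: the alias entries, the canonical name "scrape", and the function name are hypothetical, since the tool's expected method path is not shown here.

```python
from typing import Optional

# Hypothetical alias table. The canonical names on the right are
# placeholders, not the tool's documented method paths.
METHOD_ALIASES = {
    "steel.scrape": "scrape",
    "client.scrape": "scrape",
}


def normalize_method_path(raw: str) -> Optional[str]:
    """Return the canonical name for an emitted method path, or None."""
    key = raw.strip()
    if key.endswith("()"):
        key = key[:-2]  # tolerate a trailing call suffix
    return METHOD_ALIASES.get(key.lower())
```

Returning None for unknown paths keeps unrecognized variants visible as failures instead of guessing at a mapping.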

Latest recommendations

Normalize SDK method-path variants

high

Map common SDK shapes onto the expected runtime contract.

Tighten finalization checks

medium

Require evidence-backed extraction answers before finalizing a run.
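
A finalization check of this kind might look like the sketch below. It assumes an answer carries quoted evidence snippets and that each snippet must appear verbatim in the retrieved source text; the ExtractionAnswer shape and can_finalize name are hypothetical, not the product's actual contract.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExtractionAnswer:
    """Hypothetical answer shape: a value plus quoted evidence snippets."""
    value: str
    evidence: List[str] = field(default_factory=list)


def can_finalize(answer: ExtractionAnswer, source_text: str) -> bool:
    """Allow finalization only when every quoted snippet is in the source."""
    if not answer.value.strip():
        return False  # an empty answer is never final
    quotes = [q for q in answer.evidence if q.strip()]
    if not quotes:
        return False  # no evidence supplied
    return all(q in source_text for q in quotes)
```

Verbatim matching is deliberately strict; a real check might need to tolerate whitespace or casing differences.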