inspect_ai

The core Inspect AI framework: composable eval tasks, sandboxes, scorers, and multi-model runs. It is the framework behind inspect_evals, not just the task bundle.

Tags: evals, sandbox, python
Stars: 2.2k
Adoption surface: complex
Autonomy: headless
Recovery: resumable
License: ✅ open-source
Category: Evaluation and benchmarking harnesses

Repository ↗
Example: Inspect tutorial example ↗

Related in Evaluation and benchmarking harnesses