Evals & safety
Benchmarks, red-teaming, guardrails, and monitoring.
10·Free resources
- Article · 35 min
Anthropic - Red teaming language models
Structured adversarial testing before release.
- Docs · 30 min
OpenAI - Evals repository
YAML-driven benchmarks you can extend for your product.
- Article · 30 min
OWASP LLM Top 10
Threat categories from prompt injection to insecure plugins.
- Article · 15 min
LMSYS Chatbot Arena
Crowdsourced Elo rankings for building intuition about how models compare.
- Docs · 20 min
Microsoft - Azure OpenAI content filters
Configurable moderation tiers for APIs.
- Docs · 28 min
Anthropic - Evaluating Claude
Structured evals, red-team loops, and regression checks.