Cookbook

Eval this site

Learn about evals by running them yourself - against this very site.

  • Eval-Ception - can your agent pass the exam to speak on behalf of ai-evals.io? Hands-on tutorial using Promptfoo.

Have a cookbook to share?

I welcome external contributions:

  • They should be runnable with a few commands and simple parameters, not just describe principles.
  • The code should be small enough to audit and should take the user to eval results within 10 minutes (excluding LLM running time).
  • Security expectations might evolve over time as we find the right balance between easy and secure defaults. I don’t want to create bad habits or unnecessary attack vectors, but I also recognize that people will simply run much of what they find.

Feedback welcome: Reach out directly (Linktree) or join Eval-Ception Discussions.