Compare Tools

Note

This comparison is a work in progress. I’m still figuring out the best way to present this information for different audiences - some details may be too much for some readers, too nuanced for others. A crude baseline is still useful: it helps readers start to understand and categorize a relatively new space. caniuse.com also started somewhere.

Code repo: Eval-Ception on GitHub.

Feedback welcome: Reach out directly (Linktree) or join Eval-Ception Discussions.

Context

(Interactive table, rendered with ITables; requires JavaScript.)

Core capabilities

(Interactive table, rendered with ITables; requires JavaScript.)

Everything all at once

Experimental - this will evolve into a full, filterable matrix.

(Interactive table, rendered with ITables; requires JavaScript.)

Labels

Symbol  Meaning
Y       Yes - I tested it and it works
y       Yes - docs say so, I haven’t verified
N       No - I tested it, not supported (*)
n       No - docs say so, I haven’t verified (*)
P       Partial - I tested it, limited support
p       Partial - docs say so, I haven’t verified
?       Unknown - no information

(*) In this table, “support” means built-in or out-of-the-box. If a capability requires custom code or integration by the user, it is not counted as supported.
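For programmatic consumers, the label symbols can be decoded into structured values. A minimal sketch in Python - the symbol set comes from the table above, while the field names ("status", "tested") are my own illustration, not part of the API:

```python
# Map each table symbol to (status, tested-by-author).
# Symbols come from the Labels table; field names are illustrative.
LABELS = {
    "Y": ("yes", True),
    "y": ("yes", False),
    "N": ("no", True),
    "n": ("no", False),
    "P": ("partial", True),
    "p": ("partial", False),
    "?": ("unknown", False),
}

def decode(symbol):
    """Return a dict describing a table cell; unknown symbols map to 'unknown'."""
    status, tested = LABELS.get(symbol, ("unknown", False))
    return {"status": status, "tested": tested}

print(decode("Y"))  # {'status': 'yes', 'tested': True}
```

This keeps the verified/unverified distinction (uppercase vs. lowercase) machine-readable instead of collapsing it to a plain yes/no.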
Note

I might experiment with a layout using ObservableJS.

API

All comparison data is available as JSON for programmatic access.

Endpoint Description
/api/tools.json All tools
/api/tools/{id}.json Individual tool (e.g., promptfoo)
/api/features.json Built-in feature definitions and explanations
/api/schema.json JSON Schema for validation

Example Usage

import requests

# Fetch all tools (Python)
tools = requests.get("https://ai-evals.io/api/tools.json").json()

for tool in tools["tools"]:
    print(f"{tool['name']}: {tool['url']}")

// Fetch an individual tool (JavaScript)
const promptfoo = await fetch("https://ai-evals.io/api/tools/promptfoo.json")
  .then(r => r.json());

console.log(promptfoo.features);

The schema follows the JSON Schema specification, making it compatible with code generators, validators, and LLM tooling.
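As a sketch of what client-side validation could look like: the check below treats the fields that appear in the examples above ("name", "url", "features") as required, which is an assumption on my part - fetch /api/schema.json for the authoritative field list, or validate against it directly with a JSON Schema validator such as the jsonschema package.

```python
import json

# Assumed-required keys, inferred from the example code above -
# not taken from the published schema.
ASSUMED_REQUIRED = {"name", "url", "features"}

def check_tool(record):
    """Raise ValueError if an assumed-required key is missing."""
    missing = ASSUMED_REQUIRED - record.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return record

raw = '{"name": "promptfoo", "url": "https://promptfoo.dev", "features": {}}'
tool = check_tool(json.loads(raw))
print(tool["name"])  # prints "promptfoo"
```

Because the schema is standard JSON Schema, the same record can also be checked with off-the-shelf validators or fed to code generators instead of a hand-rolled check like this one.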