Compare Tools
Note
This comparison is a work in progress. I’m still figuring out the best way to present this information for different audiences - some details may be too much, others too nuanced. A crude baseline is still useful: it is a first step toward understanding and categorizing a relatively new space. Even caniuse.com started somewhere.
Code repo: Eval-Ception on GitHub.
Feedback welcome: Reach out directly (Linktree) or join Eval-Ception Discussions.
Context
Core capabilities
Everything all at once
Experimental - this will evolve into a full, filterable matrix.
Labels
| Symbol | Meaning |
|---|---|
| Y | Yes - I tested it and it works |
| y | Yes - docs say so, I haven’t verified |
| N | No - I tested it, not supported(*) |
| n | No - docs say so, I haven’t verified(*) |
| P | Partial - I tested it, limited support |
| p | Partial - docs say so, I haven’t verified |
| ? | Unknown - no information |
- (*) In this table, support means built-in or out-of-the-box. If it requires custom code/integration by the user, it is not counted as supported.
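
The legend above packs two facts into one character: the support level, and whether it was tested (uppercase) or only taken from the docs (lowercase). Below is a minimal sketch of decoding that, assuming the symbols appear in the data exactly as in the legend. `Support` and `decode_label` are hypothetical names for illustration, not part of the published API.

```python
from typing import NamedTuple


class Support(NamedTuple):
    level: str      # "yes", "no", "partial", or "unknown"
    verified: bool  # True when tested by the author, False when only the docs say so


# Lowercased legend symbol -> support level.
_LEVELS = {"y": "yes", "n": "no", "p": "partial"}


def decode_label(symbol: str) -> Support:
    """Decode one legend symbol (Y, y, N, n, P, p, ?) into a Support record."""
    if symbol == "?":
        return Support("unknown", False)
    level = _LEVELS.get(symbol.lower())
    if level is None:
        raise ValueError(f"unrecognised label: {symbol!r}")
    # Uppercase means the author tested it; lowercase means docs only.
    return Support(level, symbol.isupper())
```

Keeping the level and the verification flag separate makes it easy to filter on either one, for example "everything marked partial" or "everything not yet verified".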
Note
I might experiment with a layout using ObservableJS.
API
All comparison data is available as JSON for programmatic access.
| Endpoint | Description |
|---|---|
| /api/tools.json | All tools |
| /api/tools/{id}.json | Individual tool (e.g., promptfoo) |
| /api/features.json | Built-in feature definitions and explanations |
| /api/schema.json | JSON Schema for validation |
Example Usage
Fetch all tools in Python:

    import requests

    tools = requests.get("https://ai-evals.io/api/tools.json").json()
    for tool in tools["tools"]:
        print(f"{tool['name']}: {tool['url']}")

Fetch an individual tool in JavaScript:

    const promptfoo = await fetch("https://ai-evals.io/api/tools/promptfoo.json")
      .then(r => r.json());
    console.log(promptfoo.features);

The schema follows the JSON Schema specification, making it compatible with code generators, validators, and LLM tooling.
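
If adding `requests` as a dependency is not an option, the endpoints listed above can likely be reached with the Python standard library alone. This is a minimal sketch: `BASE_URL`, `tool_url`, and `fetch_json` are illustrative names rather than part of the site's API, and the base URL is simply the one used in the examples above.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: same host and path prefix as in the examples above.
BASE_URL = "https://ai-evals.io/api"


def tool_url(tool_id: str) -> str:
    """Build the URL for one tool, following the /api/tools/{id}.json pattern."""
    # Percent-encode the id so unusual characters cannot alter the path.
    return f"{BASE_URL}/tools/{quote(tool_id, safe='')}.json"


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the JSON body; raises on HTTP or network errors."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_json(tool_url("promptfoo"))` should return the same document as the JavaScript example, and `fetch_json(f"{BASE_URL}/schema.json")` should return the schema itself.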