Compare Tools

Note

This comparison is a work in progress. I’m still figuring out the best way to present this information for different audiences - some details may be too much for some readers, too nuanced for others. A crude baseline is still useful: it helps readers start to understand and categorize a relatively new space. caniuse.com also started somewhere.

Code repo: Eval-Ception on GitHub.

Feedback welcome: Reach out directly (Linktree) or join Eval-Ception Discussions.

Context

(Interactive table, rendered with ITables; requires JavaScript.)

Core capabilities

(Interactive table, rendered with ITables; requires JavaScript.)

Everything all at once

Experimental - this will evolve into a full, filterable matrix.

(Interactive table, rendered with ITables; requires JavaScript.)

Labels

Symbol  Meaning
Y       Yes - I tested it and it works
y       Yes - docs say so, I haven’t verified
N       No - I tested it, not supported (*)
n       No - docs say so, I haven’t verified (*)
P       Partial - I tested it, limited support
p       Partial - docs say so, I haven’t verified
?       Unknown - no information

(*) In this table, “support” means built-in or out-of-the-box. If a capability requires custom code or integration by the user, it is not counted as supported.
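For programmatic consumers, the label symbols can be decoded into structured values. A minimal sketch in Python - the symbol set comes from the table above, while the field names ("status", "tested") are my own illustration, not part of the API:

```python
# Map each table symbol to (status, tested-by-author).
# Symbols come from the Labels table; field names are illustrative.
LABELS = {
    "Y": ("yes", True),
    "y": ("yes", False),
    "N": ("no", True),
    "n": ("no", False),
    "P": ("partial", True),
    "p": ("partial", False),
    "?": ("unknown", False),
}

def decode(symbol):
    """Return a dict describing a table cell; unknown symbols map to 'unknown'."""
    status, tested = LABELS.get(symbol, ("unknown", False))
    return {"status": status, "tested": tested}

print(decode("Y"))  # {'status': 'yes', 'tested': True}
```

This keeps the verified/unverified distinction (uppercase vs. lowercase) machine-readable instead of collapsing it to a plain yes/no.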
Note

I might experiment with a layout using ObservableJS.

API

All comparison data is available as JSON for programmatic access.

Endpoint Description
/api/tools.json All tools
/api/tools/{id}.json Individual tool (e.g., promptfoo)
/api/features.json Built-in feature definitions and explanations
/api/schema.json JSON Schema for validation

Example Usage

import requests

# Fetch all tools (Python)
tools = requests.get("https://ai-evals.io/api/tools.json").json()

for tool in tools["tools"]:
    print(f"{tool['name']}: {tool['url']}")

// Fetch an individual tool (JavaScript)
const promptfoo = await fetch("https://ai-evals.io/api/tools/promptfoo.json")
  .then(r => r.json());

console.log(promptfoo.features);

The schema follows the JSON Schema specification, making it compatible with code generators, validators, and LLM tooling.
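As a sketch of what client-side validation could look like: the check below treats the fields that appear in the examples above ("name", "url", "features") as required, which is an assumption on my part - fetch /api/schema.json for the authoritative field list, or validate against it directly with a JSON Schema validator such as the jsonschema package.

```python
import json

# Assumed-required keys, inferred from the example code above -
# not taken from the published schema.
ASSUMED_REQUIRED = {"name", "url", "features"}

def check_tool(record):
    """Raise ValueError if an assumed-required key is missing."""
    missing = ASSUMED_REQUIRED - record.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return record

raw = '{"name": "promptfoo", "url": "https://promptfoo.dev", "features": {}}'
tool = check_tool(json.loads(raw))
print(tool["name"])  # prints "promptfoo"
```

Because the schema is standard JSON Schema, the same record can also be checked with off-the-shelf validators or fed to code generators instead of a hand-rolled check like this one.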