Evaluate OpenRouter
Gateways & Routers
Score OpenRouter responses against 70+ purpose-built evaluators — Groundedness, Context Relevance, Prompt Injection, Toxicity, function-call accuracy, and your own custom templates.
Recipes for OpenRouter
Prerequisites
Before you start
- · A working OpenRouter app — local or already in production.
- · A free Future AGI account with
FI_API_KEYandFI_SECRET_KEY. - · Python 3.9+ / Node 18+ / Java 17+ depending on which SDK you're installing.
- · Trace input/output payloads (or a dataset) ready to score.
Install
pip install traceAI-openaiEvaluate recipe
from fi.evals import EvalClient
from fi.evals.templates import (
ContextRelevance, Groundedness, PromptInjection, Toxicity
)
client = EvalClient(api_key="<FI_API_KEY>", secret_key="<FI_SECRET_KEY>")
# Reuse the trace input/output from your OpenRouter run
result = client.evaluate(
eval_templates=[ContextRelevance(), Groundedness(), PromptInjection(), Toxicity()],
inputs=[{
"input": user_query,
"output": openrouter_response,
"context": retrieved_context,
}],
)
print(result.eval_results)What Future AGI captures
Evaluate fields you'll see in the dashboard
-
Run any of the 70+ Future AGI evaluator templates against trace input/output
-
Score in real time on production spans, in CI on a dataset, or as a guardrail before response
-
Custom evaluators via the builder API — heuristic, LLM-as-judge, or fine-tuned Turing models
-
Eval results land back on the originating trace as searchable attributes
Common gotchas
Read these before you ship
- 01
Eval templates expect specific input keys — check the template signature in `fi.evals.templates`.
- 02
For RAG evaluators, pass the retrieved chunks as `context`, not the full document.
- 03
LLM-as-judge templates count against your eval-model token budget — switch to Turing flash for high-volume.
Next: chain it with the other recipes
Evaluate is the first step. Most teams add an evaluator the same week, and start optimising or simulating once they have a baseline. Each recipe takes minutes to wire up.
Adjacent integrations