Simulate Anthropic
Run thousands of adversarial conversations against your Anthropic agent before it sees a real user — text or voice, scripted or persona-driven.
Recipes for Anthropic
Prerequisites
Before you start
- A working Anthropic app, local or already in production.
- A free Future AGI account with `FI_API_KEY` and `FI_SECRET_KEY`.
- Python 3.9+ / Node 18+ / Java 17+, depending on which SDK you're installing.
- An async callable that takes a user message and returns the agent's response.
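That last prerequisite is just a function with one string in and one string out. A minimal sketch of the shape, using a hypothetical `EchoAgent` stand-in; in a real app its `run()` would call the Anthropic API (e.g. `anthropic.AsyncAnthropic().messages.create(...)`) and return the assistant's text:

```python
import asyncio

# Hypothetical stand-in for your Anthropic-backed agent; replace run()
# with a real Anthropic API call in your app.
class EchoAgent:
    async def run(self, user_msg: str) -> str:
        return f"echo: {user_msg}"

anthropic_agent = EchoAgent()

# The shape the simulator needs: one async callable, message in, text out.
async def my_agent(user_msg: str) -> str:
    return await anthropic_agent.run(user_msg)

print(asyncio.run(my_agent("hi")))  # prints: echo: hi
```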
Install
```
pip install traceAI-anthropic
```

Simulate recipe
```python
from simulate_sdk import CloudEngine, ScenarioGenerator
from simulate_sdk.wrappers import AgentWrapper

# Wrap your Anthropic-powered agent with a callable
async def my_agent(user_msg: str) -> str:
    return await anthropic_agent.run(user_msg)

scenarios = ScenarioGenerator().generate(
    topic="Anthropic edge cases for billing support",
    count=200,
)

report = CloudEngine().run(
    agent=AgentWrapper(my_agent),
    scenarios=scenarios,
    evaluators=["task_completion", "groundedness", "prompt_injection"],
)
report.summary()
```

What Future AGI captures
Simulate fields you'll see in the dashboard
- Wrap your Anthropic agent with `AgentWrapper` (sync or async).
- `ScenarioGenerator` builds personas from a topic + count; load CSV/JSON for hand-crafted ones.
- `CloudEngine` for text simulation, `LiveKitEngine` for voice; both produce `TestReport` objects.
- Every simulated turn becomes a real trace with eval scores attached, so failures debug like prod issues.
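Hand-crafted scenarios can start from a plain CSV file. A minimal sketch of preparing one with the stdlib `csv` module; the `persona`/`goal` column names here are illustrative assumptions, not the SDK's documented schema:

```python
import csv
import io

# Hand-crafted scenarios as CSV. Column names are assumptions for
# illustration; check the SDK docs for the expected schema.
csv_text = """persona,goal
Frustrated customer,Dispute a duplicate charge on a billing invoice
Non-native speaker,Ask how usage-based billing is calculated
"""

scenarios = list(csv.DictReader(io.StringIO(csv_text)))
print(len(scenarios), scenarios[0]["persona"])
```

Each row becomes a dict you can feed to the engine in place of generated personas.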
Common gotchas
Read these before you ship
1. `AgentWrapper` expects a single async function of `(user_msg) -> response_text`. Wrap state externally.
2. For voice simulation, set the LiveKit room URL and matching API keys in env vars, not in code.
3. Persona generation uses your default eval model; pin a model with `ScenarioGenerator(model="...")` for reproducibility.
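Gotcha 01 in practice: the wrapped callable only receives the latest user message, so keep conversation state outside it, for example in a closure. A minimal sketch with a hypothetical in-memory history:

```python
import asyncio

def make_agent():
    history: list[str] = []  # state lives outside the wrapped callable

    async def agent(user_msg: str) -> str:
        history.append(user_msg)
        # A real agent would pass `history` to the Anthropic API here.
        return f"turn {len(history)}: {user_msg}"

    return agent

agent = make_agent()
print(asyncio.run(agent("hello")))  # turn 1: hello
print(asyncio.run(agent("again")))  # turn 2: again
```

The same pattern works with a class holding the history; the simulator only ever sees the single-argument async function.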
Next: chain it with the other recipes
Simulate is the first step. Most teams add an evaluator the same week and start optimising once they have a baseline. Each recipe takes minutes to wire up.
Adjacent integrations
More integrations like Anthropic
OpenAI
GPT-4o, GPT-5, o-series, and the OpenAI Responses API.
Google GenAI
Gemini 2.x via the Google GenAI SDK (Vertex + AI Studio).
Cohere
Command, Embed, and Rerank via the Cohere API.
Mistral
Mistral Large, Codestral, and open-weight Mistral / Mixtral.