Simulate · Vapi

Simulate Vapi

Voice & Realtime

Run thousands of adversarial conversations against your Vapi agent before it sees a real user — text or voice, scripted or persona-driven.


Prerequisites

Before you start

  • A working Vapi app, local or already in production.
  • A free Future AGI account with FI_API_KEY and FI_SECRET_KEY.
  • Python 3.9+ / Node 18+ / Java 17+, depending on which SDK you're installing.
  • An async callable that takes a user message and returns the agent's response.
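
A quick way to confirm the credentials from the list above are visible to your process is a small check at startup. This is a minimal sketch using only the standard library; the helper name `missing_env` is hypothetical and not part of any Future AGI SDK, and it assumes the keys are supplied as environment variables.

```python
import os

# Variable names taken from the prerequisites list.
REQUIRED_VARS = ("FI_API_KEY", "FI_SECRET_KEY")


def missing_env(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]
```

Calling `missing_env()` before a simulation run and aborting when it returns a non-empty list surfaces a missing key early, rather than partway through a long run.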

Install

pip install traceAI-openai

Simulate recipe

from simulate_sdk import CloudEngine, ScenarioGenerator
from simulate_sdk.wrappers import AgentWrapper

# Wrap your Vapi-powered agent in an async callable.
# `vapi_agent` is a placeholder for your own Vapi client or agent object.
async def my_agent(user_msg: str) -> str:
    return await vapi_agent.run(user_msg)

scenarios = ScenarioGenerator().generate(
    topic="Vapi edge cases for billing support",
    count=200,
)

report = CloudEngine().run(
    agent=AgentWrapper(my_agent),
    scenarios=scenarios,
    evaluators=["task_completion", "groundedness", "prompt_injection"],
)

report.summary()

What Future AGI captures

Simulate fields you'll see in the dashboard

  • Wrap your Vapi agent with AgentWrapper — sync or async

  • ScenarioGenerator builds personas from a topic + count; load CSV/JSON for hand-crafted ones

  • CloudEngine for text simulation, LiveKitEngine for voice — both produce TestReport objects

  • Every simulated turn becomes a real trace with eval scores attached, so failures debug like prod issues
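
The first bullet mentions sync or async agents, while the prerequisites ask for an async callable. If your existing agent code is blocking and you would rather hand over an async function either way, one option is a small adapter. This is a sketch under assumptions: `to_async` and `blocking_agent` are hypothetical names, and `blocking_agent` merely stands in for a synchronous call into your Vapi app.

```python
import asyncio
from typing import Awaitable, Callable


def to_async(sync_fn: Callable[[str], str]) -> Callable[[str], Awaitable[str]]:
    """Adapt a blocking (user_msg) -> str function into an async callable."""

    async def wrapper(user_msg: str) -> str:
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(sync_fn, user_msg)

    return wrapper


# Hypothetical blocking agent, standing in for a synchronous Vapi client call.
def blocking_agent(user_msg: str) -> str:
    return f"echo: {user_msg}"


my_agent = to_async(blocking_agent)
```

`asyncio.to_thread` is available from Python 3.9, which matches the minimum version listed in the prerequisites.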

Common gotchas

Read these before you ship

  1. AgentWrapper expects a single async function of `(user_msg) -> response_text`. Keep conversation state outside the function.

  2. For voice simulation, set the LiveKit room URL and matching API keys in environment variables, not in code.

  3. Persona generation uses your default eval model; pin a model with `ScenarioGenerator(model="...")` for reproducibility.
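
For the first gotcha, one way to keep state outside the callable is a closure that owns the conversation history and exposes only the `(user_msg) -> response_text` function. This is a minimal sketch, not the SDK's own mechanism: `make_stateful_agent`, `respond`, and `fake_respond` are hypothetical names, and `fake_respond` stands in for a real call into your Vapi agent.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple

# Each entry is one completed turn: (user message, agent reply).
History = List[Tuple[str, str]]


def make_stateful_agent(
    respond: Callable[[History, str], Awaitable[str]],
) -> Tuple[Callable[[str], Awaitable[str]], History]:
    """Return an async (user_msg) -> str callable plus the history it maintains."""
    history: History = []

    async def agent(user_msg: str) -> str:
        # The underlying agent sees prior turns; the wrapper only sees strings.
        reply = await respond(history, user_msg)
        history.append((user_msg, reply))
        return reply

    return agent, history


# Hypothetical stand-in for a real agent call that uses the history.
async def fake_respond(history: History, user_msg: str) -> str:
    return f"turn {len(history) + 1}: {user_msg}"
```

Whether one callable is reused across scenarios is not stated here, so it is safer to assume you may need a fresh closure, or a cleared history, for each simulated conversation.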

Next: chain it with the other recipes

Simulate is the first step. Most teams add an evaluator the same week and start optimising once they have a baseline. Each recipe takes minutes to wire up.