Observability

What Is TraceAI?

TraceAI is FutureAGI’s open-source instrumentation library that turns LLM and agent code into structured OpenTelemetry traces. It ships drop-in tracers for 50+ frameworks across Python, TypeScript, Java, and C# — model SDKs (OpenAI, Anthropic, Bedrock, Google GenAI, Mistral, Cohere, Groq, Together, xAI), agent frameworks (LangChain, LlamaIndex, CrewAI, OpenAI Agents SDK, Google ADK, AutoGen, Mastra, Pydantic AI, DSPy, Strands, Smolagents, Haystack), voice (LiveKit, Pipecat), gateways (Portkey, LiteLLM), and vector stores (Pinecone, Weaviate, Qdrant, Chroma, LanceDB, Milvus, Mongo, pgvector). Apache 2.0, OTel-native, and exports to any OTLP backend.

Why It Matters in Production LLM and Agent Systems

Without an instrumentation library, every team rebuilds the same wrappers — patching OpenAI.chat.completions.create, capturing prompt/completion strings, computing token counts, threading parent span ids through async tool calls, and stitching agent state into a tree. That is six months of platform work, maintained forever, and broken every time an upstream SDK changes a method signature.
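
The shape of that hand-rolled work, for a single SDK, looks something like the sketch below. This is a minimal illustration assuming the OpenAI Python SDK v1; the wrapper and tracer names are hypothetical:

from openai import OpenAI
from opentelemetry import trace

tracer = trace.get_tracer("homegrown-llm-tracing")
client = OpenAI()

def traced_chat(**kwargs):
    # The wrapper every team rebuilds: open a span, record the model,
    # call through, then copy token usage onto the span by hand.
    with tracer.start_as_current_span("chat.completions.create") as span:
        span.set_attribute("gen_ai.request.model", kwargs.get("model", ""))
        response = client.chat.completions.create(**kwargs)
        if response.usage is not None:
            span.set_attribute("gen_ai.usage.input_tokens", response.usage.prompt_tokens)
            span.set_attribute("gen_ai.usage.output_tokens", response.usage.completion_tokens)
        return response

Multiply that by every SDK, every streaming variant, and every upstream release, and the six-month estimate is conservative.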

Two failure modes follow from skipping it. First, trace gaps. A LangGraph agent calls a tool, the tool spawns an inner LLM call via a different SDK, the inner call has no parent span id, and the trace tree breaks at the boundary. The user sees a slow turn; the engineer sees half the spans. Second, schema drift. Three teams instrument the same calls three different ways — one writes tokens_in, another writes prompt_tokens, a third writes usage.input.tokens — and dashboards stop being comparable across services.

TraceAI exists to remove that work. Each integration is maintained against the upstream SDK version it tracks; spans share a single attribute schema (gen_ai.* plus fi.span.kind); parent-child relationships flow through OTel context propagation across threads, async tasks, and process boundaries. The result is one instrumentation contract across the whole agent stack, not 50 bespoke wrappers.

How FutureAGI Builds and Ships TraceAI

FutureAGI’s approach is to keep TraceAI deliberately decoupled from the rest of the platform. The library emits OTel spans; they go wherever you point OTLP — FutureAGI, Phoenix, Langfuse, Datadog, Honeycomb, or a self-hosted Tempo. There is no proprietary SDK requirement.
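
Pointing spans at a non-FutureAGI backend needs nothing beyond the standard OTel SDK. A sketch, with the Tempo endpoint URL illustrative (the simpler register() path shown next is the FutureAGI convenience):

from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from traceai_openai import OpenAIInstrumentor

# Build a plain OTel provider that exports over OTLP/HTTP, then hand it
# to the instrumentor exactly as you would a register()-built provider.
provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://tempo:4318/v1/traces"))
)
OpenAIInstrumentor().instrument(tracer_provider=provider)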

A typical Python setup is two imports, a register call, and an instrument call:

from fi_instrumentation import register
from traceai_openai import OpenAIInstrumentor

# Register a tracer provider for the project, then patch the OpenAI SDK
# so every call emits a gen_ai.* span under that provider.
trace_provider = register(project_name="checkout-agent")
OpenAIInstrumentor().instrument(tracer_provider=trace_provider)
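
From there, application code needs no changes; an ordinary SDK call is captured automatically (model name and prompt illustrative):

from openai import OpenAI

client = OpenAI()
# Auto-instrumented: model, token usage, and latency land on an OTel span.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize my cart"}],
)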

For agents, the traceAI-openai-agents integration captures the OpenAI Agents SDK’s tool-call decisions, handoffs, and trajectory; combined with fi.evals.TrajectoryScore, you get an agent-graph view with eval verdicts at each node. For voice, traceAI-livekit captures STT, LLM, and TTS stages with gen_ai.voice.latency.ttfb_ms per turn. For gateways, traceAI-portkey and traceAI-litellm propagate the trace context across the proxy hop so a single user request stays one trace even when traffic crosses provider boundaries.
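
The setup pattern is identical across integrations. A sketch for the Agents SDK, where the module and class names (traceai_openai_agents, OpenAIAgentsInstrumentor) are assumed to mirror the traceai_openai convention:

from fi_instrumentation import register
from traceai_openai_agents import OpenAIAgentsInstrumentor  # assumed naming

trace_provider = register(project_name="checkout-agent")
# Tool-call decisions, handoffs, and the agent trajectory become spans.
OpenAIAgentsInstrumentor().instrument(tracer_provider=trace_provider)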

The differentiators vs. OpenInference (Arize’s analogous library) are reach into Java and C# (where most enterprise voice and contact-center stacks live); the fi.span.kind taxonomy that distinguishes LLM, RETRIEVER, TOOL, AGENT, CHAIN, EMBEDDING, RERANKER, GUARDRAIL, and EVALUATOR spans; and tight integration with the FutureAGI eval, simulate, and gateway surfaces — the same span ids you debug in production are the ones the simulator replays.

How to Measure or Detect It

TraceAI itself is the instrumentation layer; what it produces is what you measure:

  • Span kind: fi.span.kind distinguishes LLM, RETRIEVER, TOOL, AGENT, CHAIN, EMBEDDING, RERANKER, GUARDRAIL, EVALUATOR.
  • Provider and model: gen_ai.system, gen_ai.provider.name, gen_ai.request.model, gen_ai.response.model.
  • Tokens and cost: gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.cost.total.
  • Latency: gen_ai.client.operation.duration, gen_ai.server.time_to_first_token, gen_ai.server.time_per_output_token.
  • Tool calls: gen_ai.tool.name, gen_ai.tool.call.arguments, gen_ai.tool.call.result.
  • Coverage health: percentage of LLM API calls in your repo that produce a span (target ≥ 99%); orphan-span rate (spans with no parent).

A reliable proxy for “is TraceAI working”: every production LLM call in the last 5 minutes has a non-null gen_ai.request.model and a non-null gen_ai.usage.input_tokens.
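
That check is cheap to automate. A minimal sketch, assuming recent LLM spans can be pulled from your backend as flat attribute dicts:

def instrumentation_healthy(recent_llm_spans: list[dict]) -> bool:
    # Every LLM span in the window must carry a model name and token count;
    # a None in either field means a call slipped past instrumentation.
    return all(
        span.get("gen_ai.request.model") is not None
        and span.get("gen_ai.usage.input_tokens") is not None
        for span in recent_llm_spans
    )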

Common Mistakes

  • Instrumenting only the model SDK. If you skip the agent framework (LangChain, CrewAI), you lose the parent span and the trace becomes a bag of LLM calls with no graph.
  • Forgetting to propagate context across async and thread boundaries. Use OTel context.attach/detach or framework-native helpers (see the sketch after this list); otherwise tool spans orphan from their parent.
  • Mixing two instrumentation libraries on the same stack. Running both TraceAI and OpenInference produces duplicate spans. Pick one.
  • Disabling content capture without a redactor. Setting FI_HIDE_LLM_INVOCATION_PARAMETERS=true hides debugging detail; pair it with span-level redaction so you keep tokens and timings while masking PII.
  • Pinning to an old integration version. TraceAI tracks upstream SDKs — pinning to last year’s traceai-openai means new tool-call shapes drop on the floor.
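
For the context-propagation pitfall, the OTel Python primitives are context.get_current, context.attach, and context.detach. A minimal sketch with illustrative span names, using a worker thread, which is one boundary where context is genuinely lost by default:

from concurrent.futures import ThreadPoolExecutor
from opentelemetry import context, trace

tracer = trace.get_tracer(__name__)

def run_tool(ctx: context.Context) -> None:
    # Worker threads do not inherit OTel context; attach the caller's context
    # so the tool span nests under the agent span instead of orphaning.
    token = context.attach(ctx)
    try:
        with tracer.start_as_current_span("tool.lookup_inventory"):
            ...  # tool work
    finally:
        context.detach(token)

with tracer.start_as_current_span("agent.turn"):
    ctx = context.get_current()  # capture before crossing the boundary
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(run_tool, ctx).result()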

Frequently Asked Questions

What is TraceAI?

TraceAI is FutureAGI's Apache 2.0 OpenTelemetry instrumentation library that auto-instruments 50+ LLM and agent frameworks (OpenAI, Anthropic, LangChain, CrewAI, OpenAI Agents SDK, LiveKit, vector DBs) across Python, TypeScript, Java, and C#.

How is TraceAI different from OpenInference?

Both are OTel-aligned LLM instrumentation libraries. OpenInference is maintained by Arize and focuses on broad framework coverage; TraceAI is maintained by FutureAGI, ships in four languages including Java and C#, adds an fi.span.kind taxonomy, and integrates directly with FutureAGI's eval, simulate, and gateway products.

How do you use TraceAI in production?

Install the integration package (e.g. traceAI-openai, traceAI-langchain), call register(project_name='prod') from fi_instrumentation, then call the framework's Instrumentor().instrument(). Spans flow to any OTLP backend, including FutureAGI.