Observability

What Is OpenTelemetry (for LLMs)?

The vendor-neutral CNCF standard for emitting traces, metrics, and logs, with GenAI semantic conventions for LLM applications.

OpenTelemetry — OTel — is the vendor-neutral CNCF standard for emitting telemetry from production software: traces, metrics, and logs, with a defined wire protocol (OTLP) and a standardized attribute schema. For LLM and agent applications, OTel adds the GenAI semantic conventions: a gen_ai.* namespace covering operation type, provider, model, request parameters, tokens, costs, finish reasons, tool calls, and embeddings. Any OTLP-compatible backend — FutureAGI, Arize Phoenix, Langfuse, Datadog, Honeycomb, Tempo — can ingest these spans. In 2026 OTel is the default transport layer for LLM observability.

Why It Matters in Production LLM and Agent Systems

The pre-2024 alternative was vendor lock-in. Every observability vendor shipped a proprietary SDK, a proprietary attribute schema, and a proprietary export protocol. Switching tools meant re-instrumenting every call site. For an LLM stack with 30+ frameworks (OpenAI, Anthropic, LangChain, CrewAI, Pinecone, Weaviate, LiveKit, Pipecat), that re-instrumentation cost is prohibitive.

OTel removes that cost at the SDK layer. Instrumentation owners maintain one set of tracers; backends compete on storage, query, UI, and downstream products like evals and gateways. Switching backends is a config change, not a code change.
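
Underneath any instrumentation layer, the plain OTel SDK plumbing looks like this; a minimal sketch, assuming the stock Python SDK with the gRPC OTLP exporter, which reads its destination from OTEL_EXPORTER_OTLP_ENDPOINT:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# One tracer provider per process; instrumentation libraries attach to it.
provider = TracerProvider()

# The exporter resolves OTEL_EXPORTER_OTLP_ENDPOINT at startup, so the
# backend is chosen by configuration, not by code.
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)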

The GenAI conventions (currently in development status, version-pinned via OTEL_SEMCONV_STABILITY_OPT_IN) do for LLM data what HTTP semconv did for web apps. gen_ai.usage.input_tokens means the same thing whether the span came from a Python OpenAI call, a TypeScript Anthropic call, or a Java Bedrock call. Cross-service queries — “p99 input tokens for model X across all services last week” — become possible without per-service joins.
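
That shared contract is easiest to see with a hand-emitted span; a minimal sketch using the OTel API directly, with illustrative attribute values:

from opentelemetry import trace

tracer = trace.get_tracer("checkout-service")

# Semconv span name: "{operation} {model}". The attribute names below never
# vary by language or framework, which is what makes cross-service queries work.
with tracer.start_as_current_span("chat gpt-4o") as span:
    span.set_attribute("gen_ai.operation.name", "chat")
    span.set_attribute("gen_ai.request.model", "gpt-4o")
    span.set_attribute("gen_ai.usage.input_tokens", 412)   # illustrative
    span.set_attribute("gen_ai.usage.output_tokens", 97)   # illustrative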

The pain of skipping OTel shows up in two ways. First, proprietary SDK debt: a closed-SDK observability tool covers the same instrumentation surface traceAI does, but with none of the portability. Second, schema fragmentation: different teams emit different attribute names (prompt_tokens vs input_tokens vs tokens.in), and dashboards can no longer compare across services. OTel’s gen_ai.* namespace solves both at the contract layer.

How FutureAGI Builds on OpenTelemetry

FutureAGI is OTel-native. The instrumentation layer is traceAI — an Apache 2.0 OTel library that auto-instruments 50+ frameworks across Python, TypeScript, Java, and C# and emits spans against the gen_ai.* namespace. Spans export over OTLP to FutureAGI or to any other compatible backend; there is no proprietary SDK requirement.

In a typical Python install:

from fi_instrumentation import register
from traceai_openai import OpenAIInstrumentor

# Register an OTel tracer provider scoped to this project.
trace_provider = register(project_name="prod-checkout")

# Patch the OpenAI SDK so every call emits a gen_ai.* span.
OpenAIInstrumentor().instrument(tracer_provider=trace_provider)

Every OpenAI call now emits an OTel span carrying gen_ai.system="openai", gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.response.finish_reasons, and (opt-in) gen_ai.input.messages / gen_ai.output.messages. The span ships over OTLP gRPC to the FutureAGI collector, where it is persisted in ClickHouse and rendered as a trace tree.
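
To verify those attributes land as expected before shipping, the SDK's in-memory exporter works in a test; a minimal sketch, assuming the stock OTel SDK:

from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter

exporter = InMemorySpanExporter()
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))

# ... run one instrumented OpenAI call against this provider ...

for span in exporter.get_finished_spans():
    assert span.attributes.get("gen_ai.request.model") is not None
    assert span.attributes.get("gen_ai.usage.input_tokens") is not None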

The differentiator vs. closed platforms (LangSmith, Datadog LLM Observability, Braintrust): you can swap backends. Point your OTEL_EXPORTER_OTLP_ENDPOINT at Phoenix or Langfuse and the same instrumentation keeps working. FutureAGI’s value is on top of OTel — span-attached evals via fi.evals.HallucinationScore, agent graph rendering, simulation, and gateway routing — not in the wire protocol. Open instrumentation, choosable backend, no lock-in. That is the OTel posture every LLM observability buyer should look for in 2026.
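
The swap itself is two environment variables; a sketch with a placeholder Phoenix host, using the GenAI opt-in value documented by the semconv at time of writing (verify it against the spec version you pin):

import os

# Same traceAI instrumentation; only the destination changes.
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://phoenix.internal:4317"  # placeholder

# Pin the pre-stable GenAI conventions so spec churn is explicit, not silent.
os.environ["OTEL_SEMCONV_STABILITY_OPT_IN"] = "gen_ai_latest_experimental"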

How to Measure or Detect It

OTel exposes the attributes you measure on:

  • Operation: gen_ai.operation.name (chat, embeddings, retrieval, execute_tool), gen_ai.system, gen_ai.provider.name.
  • Request: gen_ai.request.model, gen_ai.request.temperature, gen_ai.request.max_tokens, gen_ai.request.top_p, gen_ai.request.seed.
  • Response: gen_ai.response.model, gen_ai.response.id, gen_ai.response.finish_reasons.
  • Usage: gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.usage.total_tokens, gen_ai.usage.cache_read_tokens.
  • Latency: gen_ai.server.time_to_first_token, gen_ai.server.time_per_output_token, gen_ai.client.operation.duration.
  • Cost: gen_ai.cost.total, gen_ai.cost.input, gen_ai.cost.output.

A baseline health check: every production LLM call in the last 5 minutes has non-null gen_ai.request.model and non-null gen_ai.usage.input_tokens. If either is null, traceAI is misconfigured or attribute capture has been disabled.
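
The same check can run client-side as a guard; a minimal sketch of an OTel SpanProcessor that flags GenAI spans missing either attribute before they leave the process (the logging sink is illustrative):

import logging
from opentelemetry.sdk.trace import ReadableSpan, SpanProcessor

REQUIRED = ("gen_ai.request.model", "gen_ai.usage.input_tokens")

class GenAIHealthCheck(SpanProcessor):
    # on_end fires synchronously when a span finishes.
    def on_end(self, span: ReadableSpan) -> None:
        attrs = span.attributes or {}
        if "gen_ai.operation.name" not in attrs:
            return  # only inspect GenAI spans
        missing = [key for key in REQUIRED if attrs.get(key) is None]
        if missing:
            logging.warning("GenAI span %s missing %s", span.name, missing)

# Attach next to the exporter: trace_provider.add_span_processor(GenAIHealthCheck())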

Common Mistakes

  • Treating OTel as just trace plumbing. OTel is also metrics and logs. Use the same exporter for all three; don’t run three different agents.
  • Skipping OTEL_SEMCONV_STABILITY_OPT_IN. The GenAI spec is still pre-stable. Pin the opt-in environment variable so spec churn doesn’t silently break your dashboards.
  • Mixing OpenInference and OTel gen_ai.* on the same span. Pick one. Both work, but duplicated attributes balloon span size and confuse queries.
  • Forgetting context propagation across HTTP boundaries. Use OTel traceparent headers when calling sub-services; otherwise the trace breaks at the network hop (see the propagation sketch after this list).
  • Buying a tool that doesn’t speak OTLP. A proprietary SDK on top of OTel is fine. A proprietary SDK instead of OTel is a switching cost waiting to be paid.
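
For the propagation item above, a minimal sketch using OTel's propagation API with the requests library; the sub-service URL is a placeholder:

import requests
from opentelemetry.propagate import inject

headers = {}
inject(headers)  # writes W3C traceparent (and tracestate) into the dict

# The downstream service extracts this context, so its spans join the same
# trace instead of starting a new one at the network hop.
requests.post(
    "http://sub-service.internal/v1/plan",  # placeholder URL
    json={"task": "example"},
    headers=headers,
)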

Frequently Asked Questions

What is OpenTelemetry?

OpenTelemetry (OTel) is the vendor-neutral CNCF standard for emitting traces, metrics, and logs, with a defined wire protocol (OTLP) and a standardized attribute namespace. For LLMs it adds the gen_ai.* semantic conventions for prompts, models, tokens, and tool calls.

How is OpenTelemetry different from OpenInference?

OpenTelemetry is the broader CNCF standard. OpenInference is Arize's complementary attribute schema for LLM spans, designed to coexist with OTel. Both produce OTLP-compatible spans; FutureAGI's traceAI emits the official gen_ai.* OTel namespace plus fi.span.kind extensions.

How do you use OpenTelemetry with LLMs?

Install traceAI (FutureAGI's OTel instrumentation library), call register() with your project name, and instrument the framework (OpenAI, LangChain, CrewAI). Spans flow over OTLP to any compatible backend including FutureAGI.