OpenInference vs OpenLLMetry vs OpenLIT in 2026: OTel for LLMs Compared
OpenInference, OpenLLMetry, and OpenLIT compared for OpenTelemetry-based LLM observability in 2026: instrumentation, languages, semconv, and tradeoffs.
Three OpenTelemetry-aligned instrumentation libraries come up most often for LLM observability in 2026: OpenInference, OpenLLMetry, and OpenLIT. They overlap on the goal (emit OTel spans for LLM calls, vector DB lookups, and framework operations) but differ on language coverage, integration breadth, semantic-convention contribution, and adjacent features. This guide compares the three head to head with the framework-neutral question every team should answer first: which OTel backend will I send these spans to, and which library produces the spans my backend understands best?
TL;DR: Which to pick in 2026
| Library | Language coverage | License | Best for | Skip if |
|---|---|---|---|---|
| traceAI (FutureAGI) | Python, TypeScript, Java, C# | Apache 2.0 | Polyglot OTel GenAI emitter paired with FutureAGI eval/gateway/guardrails (recommended default) | You only need Python on a non-FutureAGI backend |
| OpenInference | Python (broad), TS, Java (LangChain4j/Spring AI) | Apache 2.0 | Arize Phoenix backend, broad agent framework coverage | Vector DB tracing is the top need |
| OpenLLMetry | Python; JS/TS via OpenLLMetry-JS; Go via go-openllmetry (manual) | Apache 2.0 | Datadog, Honeycomb, New Relic, Grafana, SigNoz backends | GPU metrics or .NET matter |
| OpenLIT | Python, TypeScript, Go | Apache 2.0 | Multi-language SDKs, GPU monitoring, built-in evals | You need .NET or broadest agent-framework count |
If you only read one row: traceAI is the recommended Apache 2.0 emitter for production teams that need polyglot OTel GenAI coverage across Python, TypeScript, Java, and C# from one library, paired with the FutureAGI platform out of the box. OpenInference fits the Arize Phoenix path with broad framework coverage; OpenLLMetry fits Python, JS/TS, and (manually-instrumented) Go stacks with deep vector DB and observability-vendor support; OpenLIT fits when GPU monitoring and broader Go auto-instrumentation are in scope. For deeper reads: see the OTel instrumentation tools guide, traceAI deep-dive, and the LLM observability platform buyer guide.
What each library actually is
OpenInference
OpenInference is a set of conventions and plugins complementary to OpenTelemetry that enable tracing of AI applications. The repo had under 1k stars and around 200 forks in the May 2026 snapshot, with Apache 2.0 licensing; refresh the exact counts at publish time. The project is maintained by Arize AI and is the default instrumentation library for the Arize Phoenix backend.
Coverage spans Python (30+ integrations including OpenAI, Anthropic, MistralAI, Groq, Google GenAI, LangChain, LlamaIndex, DSPy, Haystack, CrewAI, PydanticAI, AutoGen, BeeAI, Bedrock, VertexAI, Portkey, LiteLLM, and Instructor), TypeScript (OpenAI, Anthropic, Claude Agent SDK, BeeAI, LangChain.js, AWS Bedrock standard and agent runtime, plus custom integrations for Vercel AI, TanStack AI, and MCP), and Java (LangChain4j, Spring AI, annotation-based instrumentation). Note that AutoGen entered maintenance mode in late 2025; verify the AG2 fork or Microsoft Agent Framework path for new work that depends on AutoGen instrumentation.
The relationship to OpenTelemetry is that OpenInference defines a complementary specification layer. The spec is transport- and file-format-agnostic, and any OpenTelemetry-compatible backend can ingest OpenInference spans because they are still OTel spans with additional attributes following the OpenInference convention; the sketch below makes that concrete.
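A minimal hand-rolled sketch of that idea, assuming the attribute names published in the OpenInference spec (openinference.span.kind, llm.model_name, input.value, output.value); verify them against the current spec before relying on them:

```python
from opentelemetry import trace

tracer = trace.get_tracer("manual-openinference-demo")

# A plain OTel span; the OpenInference layer is just extra attributes,
# so any OTel backend can ingest it even without OpenInference support.
with tracer.start_as_current_span("llm-call") as span:
    span.set_attribute("openinference.span.kind", "LLM")
    span.set_attribute("llm.model_name", "gpt-4o")
    span.set_attribute("input.value", "Explain OpenInference in one paragraph.")
    # ... make the model call, then record the output ...
    span.set_attribute("output.value", "<model response>")
```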
OpenLLMetry
OpenLLMetry is a set of extensions built on top of OpenTelemetry that provides tracing and telemetry for LLM applications. The repo lists about 7.1k stars (May 2026 snapshot) and Apache 2.0 licensing. Traceloop’s official docs list Python, JavaScript/TypeScript, and Go getting-started paths: the main traceloop/openllmetry repo is Python, OpenLLMetry-JS is a separate JS/TS project, and traceloop/go-openllmetry is a separate Go SDK that is currently manual instrumentation per the docs (no automatic library instrumentation yet). Teams should evaluate JS/TS and Go needs against those repos specifically.
Coverage spans 15+ LLM providers (OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Groq, Bedrock, Vertex AI, HuggingFace, Ollama, and others), 7 vector databases (Chroma, Pinecone, Qdrant, Weaviate, Milvus, LanceDB, Marqo), 9+ frameworks (LangChain, LlamaIndex, CrewAI, Haystack, LiteLLM, LangGraph, Langflow, and others), and 23+ observability destinations (Datadog, Honeycomb, New Relic, Grafana, SigNoz, Azure Application Insights, Google Cloud, and many others).
The biggest milestone for OpenLLMetry is that its semantic conventions were upstreamed into OpenTelemetry. This means the OTel GenAI semantic conventions in mainline OTel reflect Traceloop’s contribution. For teams that bet on the OTel GenAI standard, OpenLLMetry is one of the most direct paths to that standard.
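For orientation, here is what those upstreamed GenAI attributes look like when set by hand; the names are a snapshot of a still-evolving spec, so check the current semantic-conventions release before hard-coding them:

```python
from opentelemetry import trace

tracer = trace.get_tracer("genai-semconv-demo")

# A chat call expressed directly in OTel GenAI semconv attributes.
# In practice OpenLLMetry sets these for you; shown manually here
# only to make the convention visible.
with tracer.start_as_current_span("chat gpt-4o") as span:
    span.set_attribute("gen_ai.system", "openai")
    span.set_attribute("gen_ai.request.model", "gpt-4o")
    span.set_attribute("gen_ai.usage.input_tokens", 21)
    span.set_attribute("gen_ai.usage.output_tokens", 128)
```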
OpenLIT
OpenLIT is an OpenTelemetry-native instrumentation library with a broader scope than the other two. The repo lists about 2.4k stars (May 2026 snapshot), Apache 2.0 licensing, and a repository language mix of TypeScript (52%), Python (39.6%), and Go (7.2%). The project provides vendor-neutral SDKs for Python, TypeScript, and Go that send traces and metrics aligned with OpenTelemetry semantic conventions.
Coverage spans 25+ LLM providers (OpenAI, Anthropic, Cohere, Mistral AI, Groq, Google AI Studio, Together AI, Ollama, AWS Bedrock, Azure AI Inference, Vertex AI, vLLM, Reka, LiteLLM, Hugging Face, AI21, GPT4All, PremAI, Sarvam AI, Julep, MultiOn, Replicate, and others) and 20+ AI frameworks (LangChain, LlamaIndex, CrewAI, Pydantic AI, Agno, Browser Use, Haystack, Letta, Mem0, AG2 (AutoGen), Controlflow, Crawl4AI, Dynamiq, OpenAI Agents, Firecrawl, Google ADK, LangGraph, Smolagents, Strands Agents, Claude Agent SDK, MS Agent Framework, Vercel AI). Vector DB coverage spans Pinecone, ChromaDB, Qdrant, Milvus, Astra DB, and PostgreSQL.
OpenLIT’s differentiators are GPU monitoring (OpenTelemetry-native GPU metrics), Fleet Hub for OpAMP-based collector management, 11 built-in evaluation types including hallucination and bias detection, prompt management and vault functionality, and cost tracking with custom and fine-tuned model pricing. The platform positions itself as an end-to-end AI engineering solution rather than purely observability.
Architecture and primitives
| Dimension | OpenInference | OpenLLMetry | OpenLIT |
|---|---|---|---|
| Primary language | Python | Python | Python, TypeScript, Go |
| Number of integrations | 30+ Python, 10+ TS, Java | 15+ LLM providers, 7 vector DBs, 9+ frameworks, 23+ destinations | 25+ LLM, 20+ frameworks |
| Vector DB coverage | Pinecone, Weaviate, Chroma, Qdrant | 7 (Chroma, Pinecone, Qdrant, Weaviate, Milvus, LanceDB, Marqo) | 6 (Pinecone, ChromaDB, Qdrant, Milvus, Astra DB, PostgreSQL) |
| Semantic convention | OpenInference spec, complementary to OTel | OTel GenAI semconv (upstreamed) | OTel GenAI semconv |
| Backend recommendation | Arize Phoenix, any OTel | Datadog, Honeycomb, New Relic, Grafana, SigNoz | Any OTel, OpenLIT UI |
| GPU monitoring | No | No | Yes (OTel-native) |
| Built-in evals | No | No | 11 types (hallucination, bias, etc.) |
| Fleet management | No | No | Yes (Fleet Hub for OpAMP-based collector management) |
| Cost tracking | Via OTel attributes | Via OTel attributes | Built-in custom pricing |
| Stars (May 2026 snapshot) | under 1k | about 7k | about 2.4k |
| License | Apache 2.0 | Apache 2.0 | Apache 2.0 |

Code samples
OpenInference: instrument LangChain in Python
```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

from openinference.instrumentation.langchain import LangChainInstrumentor

trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://collector:4317"))
)

LangChainInstrumentor().instrument()

# Now any LangChain call emits OpenInference-conformant OTel spans
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")
response = llm.invoke("Explain OpenInference in one paragraph.")
```
OpenLLMetry: instrument OpenAI in Python
```python
from traceloop.sdk import Traceloop
from openai import OpenAI

Traceloop.init(app_name="my-app", api_endpoint="http://collector:4318")

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain OpenLLMetry."}],
)
# OTel GenAI semconv spans emitted automatically
```
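Beyond auto-instrumentation, OpenLLMetry’s decorators group multi-step pipelines under one parent span. A minimal sketch reusing the client above (decorator per the Traceloop docs; the function body is illustrative):

```python
from traceloop.sdk.decorators import workflow

# Wraps the function in a parent "workflow" span; every LLM or
# vector-DB span emitted inside it nests under that span in the trace.
@workflow(name="summarize-ticket")
def summarize(ticket_text: str) -> str:
    result = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize: {ticket_text}"}],
    )
    return result.choices[0].message.content
```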
OpenLIT: instrument multi-language with one SDK
```python
# Python
import openlit

openlit.init(otlp_endpoint="http://collector:4318")
# All supported LLM providers and frameworks now traced
```

```typescript
// TypeScript
import openlit from "openlit";

openlit.init({ otlpEndpoint: "http://collector:4318" });
// Supported LLM providers and frameworks now traced
```

```go
// Go
package main

import "github.com/openlit/openlit-go"

func main() {
	openlit.Init(openlit.Config{OtlpEndpoint: "http://collector:4318"})
}
```
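OpenLIT’s GPU monitoring hangs off the same init call; the OpenLIT docs describe a collect_gpu_stats flag on the Python SDK, so a sketch looks like the following (verify the exact parameter name against the release you install):

```python
import openlit

# Enables OTel-native GPU metrics (utilization, memory, temperature)
# alongside trace export; flag name per OpenLIT docs -- confirm it
# against the installed SDK version before shipping.
openlit.init(
    otlp_endpoint="http://collector:4318",
    collect_gpu_stats=True,
)
```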
When to pick which
Pick OpenInference if
- Your backend is Arize Phoenix or any OTel-compatible system.
- You need broad agent framework coverage including LangChain, LlamaIndex, DSPy, CrewAI, AutoGen, BeeAI, Haystack, Pydantic AI.
- You need TypeScript or Java instrumentation alongside Python.
- You value the OpenInference complementary specification approach.
Pick OpenLLMetry if
- Your backend is Datadog, Honeycomb, New Relic, Grafana, or SigNoz, where OpenLLMetry has direct integration paths.
- Your stack is Python (or Python plus JS/TS via OpenLLMetry-JS, or Python plus Go via the separate traceloop/go-openllmetry SDK with manual instrumentation per the docs).
- You want vector-database tracing depth (7 supported vector DBs).
- You value the upstreamed OTel GenAI semantic conventions and want the most direct path to that standard.
Pick OpenLIT if
- You need multi-language SDKs (Python, TypeScript, Go).
- GPU monitoring is in scope.
- You want built-in evals next to instrumentation rather than adding a separate eval layer.
- Fleet Hub for OpAMP-based collector management is part of your operational model.
Common mistakes when picking among the three
- Treating these libraries as rival standards rather than interchangeable OTel emitters. They all emit OTel spans; the right question is which one fits your backend, languages, and frameworks.
- Picking by stars. OpenLLMetry has the most stars in the May 2026 snapshot (about 7k), but OpenInference’s lower star count reflects narrower scope rather than weaker quality. Pick by ecosystem fit.
- Mixing libraries on the same call path. If you instrument OpenAI calls with both OpenInference and OpenLLMetry, you get duplicate spans. Pick one library as the primary instrumentation.
- Ignoring semantic-convention drift. OTel GenAI semconv is still evolving, and some attributes change names between minor versions. Lock the OTel SDK and semconv packages to tested versions in CI (see the pinning sketch after this list).
- Skipping the backend test. The same OTel span can render very differently in Arize Phoenix, Langfuse, Datadog, Honeycomb, Grafana, or SigNoz. Test the actual rendering before picking.
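One minimal way to do that pin, assuming a pip-based project (version numbers are illustrative; pair the semconv package with whatever SDK release you actually test against):

```text
# requirements.txt -- pin the whole OTel stack together so GenAI
# attribute names don't drift between deploys (versions illustrative)
opentelemetry-sdk==1.27.0
opentelemetry-exporter-otlp==1.27.0
opentelemetry-semantic-conventions==0.48b0
```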
Where traceAI is strongest: cross-language coverage
traceAI is FutureAGI’s OTel GenAI semconv emitter and is the recommended pick when polyglot coverage is the constraint. Python, TypeScript, Java, and C# all emit consistent OTel GenAI spans from one library, and it is the same library that the FutureAGI eval, simulation, gateway, and guardrails platform consumes natively. OpenInference covers Python and TypeScript broadly, with Java for LangChain4j and Spring AI; OpenLLMetry ships Python, JS/TS, and Go via separate SDK repos (Go is manual instrumentation per the docs); OpenLIT covers Python, TypeScript, and Go but does not ship .NET. If your stack runs across all four major LLM languages, traceAI is the most direct single-library fit for that requirement.
For teams that want eval scores attached to spans as part of the same instrumentation layer, FutureAGI’s traceAI plus the Agent Command Center gateway plus the eval pipeline form one vertically integrated stack: 50+ eval metrics attach as span attributes, the gateway routes 100+ providers with BYOK and emits its own spans into the same trace tree, and turing_flash runs guardrail screening at 50 to 70 ms p95 inline across 18+ guardrail types, with full eval templates running roughly 1 to 2 seconds when a deeper rubric is needed. For teams that prefer to keep the instrumentation library and eval layer separate, picking OpenInference, OpenLLMetry, or OpenLIT for instrumentation and any eval product for scoring is equally valid. All four libraries emit OTel spans; the eval layer reads OTel spans.
Sources
- OpenInference repo
- OpenLLMetry repo
- Traceloop site
- OpenLIT repo
- OpenLIT site
- OpenTelemetry GenAI semantic conventions
- traceAI repo
- FutureAGI pricing
Series cross-link
Next: Best OTel Instrumentation Tools for LLMs, traceAI Deep-Dive, LLM Observability Buyer Guide
Frequently asked questions
Which OpenTelemetry LLM instrumentation library should I pick in 2026?
Are these libraries OpenTelemetry-compatible?
What languages do these libraries support?
Which library has the most framework integrations?
Can I use these libraries together?
How do these compare to traceAI?
What is the difference between OpenInference and OpenTelemetry GenAI semconv?
Which library is best for vector database tracing?