What Is an Ontology?
A formal, machine-readable model of entities, relationships, hierarchies, and constraints in a domain — typically expressed in RDF, OWL, or schema.org.
An ontology is a formal, machine-readable specification of a domain. It declares the types of entities that exist (Person, Order, Drug), the relationships between them (Person prescribes Drug to Person), the hierarchies that organise them (Cardiologist is-a Doctor is-a Person), and the constraints that govern valid instances (a Patient must have exactly one date of birth). Standard languages include RDF Schema, OWL, and SKOS, with web-scale ontologies like schema.org sitting on top. In LLM systems, an ontology is what gives “knowledge” structure — without it, a knowledge graph is just a bag of triples and a structured output is just JSON with hopes attached.
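The four moving parts above — types, relationships, hierarchies, and constraints — can be sketched in plain Python. This is a toy model for illustration, not any standard ontology API; real systems would use RDF/OWL tooling.

```python
# Toy ontology: a subclass hierarchy plus a per-type constraint table.
# Illustrative only -- hierarchies and required fields are from the examples above.

SUBCLASS_OF = {           # "is-a" edges
    "Cardiologist": "Doctor",
    "Doctor": "Person",
}

REQUIRED_FIELDS = {       # cardinality-style constraints
    "Patient": {"date_of_birth"},
}

def is_a(child, ancestor):
    """Walk the is-a chain; subclass relations are transitive."""
    while child is not None:
        if child == ancestor:
            return True
        child = SUBCLASS_OF.get(child)
    return False

def valid_instance(entity_type, fields):
    """An instance is valid if every required field is present."""
    return REQUIRED_FIELDS.get(entity_type, set()) <= set(fields)

print(is_a("Cardiologist", "Person"))              # True: transitive is-a
print(valid_instance("Patient", {"name": "Ada"}))  # False: no date_of_birth
```

The transitivity in `is_a` is exactly what a flat database schema cannot express: a query for Person instances should also return every Cardiologist.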
Why It Matters in Production LLM and Agent Systems
LLMs hallucinate types as readily as they hallucinate facts. Ask a model to extract entities from a clinical note and it will happily emit a “Diagnosis” field with a value that is actually a treatment, or a “Medication” with a dosage where the unit is missing. An ontology turns those silent type errors into loud schema-validation failures.
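A minimal stdlib-only sketch of that failure mode: the extractor emits a treatment in the diagnosis field and a dosage with no unit, and a pattern check turns both silent errors into explicit failures. Field names and patterns here are illustrative, a stand-in for a full schema derived from a clinical ontology.

```python
import re

# Each field maps to a regex it must match -- a tiny stand-in for a
# JSON Schema derived from the clinical ontology.
FIELD_PATTERNS = {
    "diagnosis": r"^ICD-10:[A-Z]\d+",          # must be a coded diagnosis
    "dosage": r"^\d+(\.\d+)?\s?(mg|ml|mcg)$",  # value *and* unit required
}

def validate(entity):
    """Return a list of per-field errors; an empty list means valid."""
    errors = []
    for field, pattern in FIELD_PATTERNS.items():
        value = entity.get(field)
        if value is None:
            errors.append(f"{field}: missing")
        elif not re.match(pattern, value):
            errors.append(f"{field}: {value!r} fails {pattern!r}")
    return errors

# The model emitted a treatment as the diagnosis and dropped the dosage unit.
print(validate({"diagnosis": "lisinopril", "dosage": "10"}))  # two errors
```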
The pain is sharpest in three places.
- Structured extraction: a contract-analysis agent emits entities that look right but cannot be loaded into the downstream graph because a relationship cardinality is wrong.
- RAG grounding: the retriever returns chunks tagged with ontology types (“Policy”, “Procedure”), but the LLM mixes them up in the answer, and the hallucination evaluator can’t tell because both look like text.
- Tool calling: the function signature is the ontology. Wrong types, missing required fields, and extra unrecognised keys are all caught by the gateway at request time, but only if you wired the schema in.
In 2026 agent stacks, the Model Context Protocol (MCP) ecosystem has effectively standardised tool definitions as miniature ontologies. Every MCP server publishes a schema; every consuming agent validates against it. The ontology is no longer a 90s academic artefact — it is the contract that lets agents from different vendors interoperate.
How FutureAGI Uses Ontologies for Evaluation
FutureAGI does not maintain an ontology system; that is a knowledge-engineering responsibility upstream of inference. We use the ontology — or any schema you supply — as the ground truth our structured-output evaluators check against.
Concretely: a healthcare team runs an extraction agent that should emit entities matching their internal clinical ontology (Patient, Diagnosis, Medication, with typed fields and required cardinality). They define a JSON Schema derived from the ontology and attach it via Dataset.add_evaluation with SchemaCompliance, FieldCompleteness, and TypeCompliance. Every inference is scored on three dimensions: (1) does the output validate against the schema, (2) are all required fields present, (3) do the field types match. The dashboard surfaces per-field failure rates so the team can see that “Medication.dosage_unit” fails 12% of the time and target a prompt fix. If the underlying ontology evolves (say, a new required field), versioning the dataset captures that schema change and the regression eval runs against both old and new versions.
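The per-field failure-rate view can also be reproduced offline from raw eval results. A sketch, assuming each result is a simple dict of field name to pass/fail (this data shape is illustrative, not the actual dashboard API):

```python
from collections import Counter

def field_failure_rates(results):
    """Fraction of inferences in which each field failed validation."""
    failures = Counter()
    for result in results:
        for field, passed in result.items():
            if not passed:
                failures[field] += 1
    n = len(results)
    return {field: count / n for field, count in failures.items()}

# Toy data: Medication.dosage_unit missing in 1 of 4 inferences.
runs = [
    {"Patient.dob": True, "Medication.dosage_unit": True},
    {"Patient.dob": True, "Medication.dosage_unit": False},
    {"Patient.dob": True, "Medication.dosage_unit": True},
    {"Patient.dob": True, "Medication.dosage_unit": True},
]
print(field_failure_rates(runs))   # {'Medication.dosage_unit': 0.25}
```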
For knowledge-graph-grounded RAG, the SourceAttribution and Groundedness evaluators check whether claims trace back to ontology-typed entities, not just text snippets — closing the gap between “the model said something plausible” and “the model said something the ontology supports”.
How to Measure or Detect It
Ontology adherence is measured against the schema layer:
- fi.evals.SchemaCompliance: returns a 0–1 schema-validation score with a per-field reason; the canonical ontology-conformance metric.
- fi.evals.JSONValidation: a stricter pass/fail check against a JSON Schema; surfaces the invalid-JSON rate immediately.
- fi.evals.FieldCompleteness: returns the fraction of required fields present; complements schema validation.
- fi.evals.TypeCompliance: checks per-field types independent of presence; catches subtle type errors.
- Per-field failure heatmap (dashboard signal): which ontology fields fail most often, broken down by route, prompt version, and model.
from fi.evals import SchemaCompliance

check = SchemaCompliance()
result = check.evaluate(
    output={"diagnosis": "ICD-10:I10", "medication": "lisinopril 10mg"},
    schema={
        "diagnosis": {"type": "string", "pattern": "^ICD-10:"},
        "medication": {"type": "string", "required": True},
    },
)
print(result.score, result.reason)
Common Mistakes
- Treating an ontology as a database schema. Database schemas are flat; ontologies model hierarchies, transitivity, and constraints — flatten them and you lose half the value.
- Letting the LLM define the ontology. The model will produce plausible types that don’t match the downstream system. Define the ontology first, prompt to it second.
- Skipping cardinality constraints. Required-field count and uniqueness are where most extraction agents fail; check them, not just types.
- Versioning the data without versioning the ontology. A schema evolution silently invalidates yesterday’s eval results.
- Conflating ontology coverage with answer quality. A response can be schema-valid and factually wrong; pair SchemaCompliance with Groundedness.
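Cardinality is cheap to check directly. A sketch of a required-count check with (min, max) occurrence bounds; the constraint table and field names are illustrative:

```python
# Illustrative cardinality constraints: (min, max) occurrences per field.
# max = None means unbounded.
CARDINALITY = {
    "date_of_birth": (1, 1),   # exactly one
    "diagnosis": (1, None),    # at least one
}

def check_cardinality(record):
    """Flag fields whose occurrence count violates their (min, max) bounds."""
    errors = []
    for field, (lo, hi) in CARDINALITY.items():
        n = len(record.get(field, []))
        if n < lo or (hi is not None and n > hi):
            errors.append(f"{field}: {n} occurrences, expected {lo}..{hi}")
    return errors

# Two dates of birth and zero diagnoses: both are cardinality violations
# that pure type checking would miss.
print(check_cardinality({"date_of_birth": ["1970-01-01", "1971-02-02"],
                         "diagnosis": []}))
```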
Frequently Asked Questions
What is an ontology?
An ontology is a formal specification of a domain's entities, relationships, hierarchies, and constraints, expressed in a machine-readable language like OWL or RDF Schema. It is the schema layer that knowledge graphs and structured-output systems are built on.
How is an ontology different from a knowledge graph?
An ontology defines the types and relationships allowed in a domain (the schema). A knowledge graph is a populated graph of actual entities and edges that conforms to that ontology. The ontology is the contract; the knowledge graph is the data.
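The contract/data split can be made concrete: the ontology lists the allowed (subject-type, relation, object-type) triples, and each edge in the knowledge graph is checked against that list. Types, relations, and instances here are hypothetical.

```python
# Ontology layer: which (domain, relation, range) triples are allowed.
ALLOWED = {
    ("Doctor", "prescribes", "Drug"),
    ("Patient", "diagnosed_with", "Diagnosis"),
}

def conforms(subj_type, rel, obj_type):
    """Does a typed edge satisfy the ontology's relation constraints?"""
    return (subj_type, rel, obj_type) in ALLOWED

# Knowledge-graph layer: concrete typed edges (subject, subject type,
# relation, object, object type). The second edge prescribes a Diagnosis,
# which the ontology forbids.
edges = [
    ("Dr. Lee", "Doctor", "prescribes", "lisinopril", "Drug"),
    ("Dr. Lee", "Doctor", "prescribes", "hypertension", "Diagnosis"),
]

for subj, st, rel, obj, ot in edges:
    status = "ok" if conforms(st, rel, ot) else "VIOLATION"
    print(f"{subj} --{rel}--> {obj}: {status}")
```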
How does FutureAGI use ontologies?
FutureAGI's SchemaCompliance and JSONValidation evaluators score LLM outputs against the schema or ontology you supply. If your agent is supposed to emit entities of type Patient with required fields, the evaluator returns a per-field validity score and surfaces drift over releases.