Models

What Is a Ground Atom?

A ground atom is the simplest fully-specified statement in first-order logic: a predicate applied to constants only, with no variables. parent(alice, bob) is a ground atom; parent(X, bob) is not, because X is a variable. FutureAGI treats ground atoms as structured facts that must be schema-valid and source-supported before an LLM writes them into a knowledge graph. They show up wherever symbolic structure meets neural extraction: triples, structured outputs, and entity-relation extraction from text.
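The ground/non-ground distinction can be sketched in a few lines, using the Prolog-style convention that uppercase-initial terms are variables (an illustrative assumption; real systems track variables explicitly rather than by casing):

```python
def is_variable(term: str) -> bool:
    """Prolog convention: uppercase-initial terms are variables."""
    return term[:1].isupper()

def is_ground(predicate: str, args: list) -> bool:
    """A ground atom has constants only -- no variables anywhere."""
    return all(not is_variable(a) for a in args)

print(is_ground("parent", ["alice", "bob"]))  # True: ground atom
print(is_ground("parent", ["X", "bob"]))      # False: X is a variable
```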

Why Ground Atoms Matter in Production LLM and Agent Systems

Most LLM applications eventually need to convert free text into structured facts: a support ticket becomes assigned_to(ticket_42, agent_kara), a research paper becomes cites(paper_a, paper_b), a transaction becomes transferred(account_x, account_y, 1500_usd, 2026_05_07). Each of these is a ground atom in disguise. Treating them as ground atoms makes downstream reasoning explicit: a Datalog rule, a graph query, a compliance check, or a tool call can fire deterministically on the extracted fact rather than re-prompting the model.
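As a hypothetical sketch of what "fire deterministically" means here: once facts are stored as ground atoms, a pattern query over them is plain set matching, with no model call involved (the fact store and `query` helper below are illustrative, not a FutureAGI API):

```python
# Extracted facts stored as ground atoms: (predicate, arg1, arg2)
facts = {
    ("assigned_to", "ticket_42", "agent_kara"),
    ("cites", "paper_a", "paper_b"),
}

def query(predicate, *pattern):
    """Match stored atoms against a pattern; None acts as a wildcard."""
    return [
        f for f in facts
        if f[0] == predicate
        and all(p is None or p == a for p, a in zip(pattern, f[1:]))
    ]

print(query("cites", "paper_a", None))  # [('cites', 'paper_a', 'paper_b')]
```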

The pain shows up when the extraction layer is sloppy. A model that emits parent("Alice Smith", "Bob") in one trace and parent("alice smith", "Bob Smith") in the next is producing unreconciled atoms — same fact, different constants. The downstream graph store has two nodes for one entity, joins fail, and a user-facing answer cites a fact that “doesn’t exist” because the constants were not canonicalized.
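A minimal sketch of the canonicalization step that prevents this, with a toy alias table standing in for a real entity-linking system (all names below are illustrative):

```python
# Toy alias table; production systems use entity linking, not a dict.
ALIASES = {
    "alice smith": "person:alice_smith",
    "alice":       "person:alice_smith",
    "bob":         "person:bob_smith",
    "bob smith":   "person:bob_smith",
}

def canonicalize(atom: tuple) -> tuple:
    """Resolve free-text constants to canonical entity IDs before storage."""
    pred, *args = atom
    return (pred, *(ALIASES.get(a.strip().lower(), a) for a in args))

a1 = canonicalize(("parent", "Alice Smith", "Bob"))
a2 = canonicalize(("parent", "alice smith", "Bob Smith"))
print(a1 == a2)  # True: both traces resolve to the same atom
```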

A second pain is ungrounded atoms: the model produces worked_at(jane_doe, openai) when the source text never mentioned OpenAI. The atom is structurally valid but factually fabricated. Without an extraction-time check, those hallucinated ground atoms accumulate in the knowledge graph and corrupt every downstream query.
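A deliberately naive sketch of an extraction-time grounding gate: reject any atom whose argument mentions never appear in the source text. Real grounding checks use NLI or evaluator models rather than substring matching, so treat this only as an illustration of where the gate sits:

```python
def is_lexically_grounded(atom: tuple, source: str) -> bool:
    """Naive check: every argument mention must appear in the source text."""
    _, *args = atom
    text = source.lower()
    return all(arg.replace("_", " ") in text for arg in args)

source = "Jane Doe spent four years at Acme Corp before founding a startup."
print(is_lexically_grounded(("worked_at", "jane_doe", "acme_corp"), source))  # True
print(is_lexically_grounded(("worked_at", "jane_doe", "openai"), source))     # False: fabricated
```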

In 2026 agent systems where planners reason over knowledge graphs (agentic-rag, llm-knowledge-graph), ground-atom quality is the trust contract. A planner that picks tool calls based on retrieved atoms will silently route around the truth if the atoms are wrong.

How FutureAGI Handles Ground-Atom Extraction

FutureAGI does not implement Datalog or first-order theorem proving. It evaluates the LLM that extracts ground atoms from text and the agent that consumes them. There are three surfaces.

Schema layer — when the model emits a structured output like {"predicate": "parent", "args": ["alice", "bob"]}, fi.evals.SchemaCompliance and fi.evals.JSONValidation check structural correctness against a JSON Schema. A failure here means the atom is malformed before semantics even apply.
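The kind of structural check this layer performs can be hand-rolled for illustration (the real evaluators validate against a JSON Schema; the `SCHEMAS` table and `check_structure` function below are hypothetical):

```python
SCHEMAS = {"parent": 2, "cites": 2}  # predicate -> expected arity

def check_structure(atom: dict) -> bool:
    """Well-formed atom: known predicate, correct arity, constant args."""
    pred, args = atom.get("predicate"), atom.get("args")
    if pred not in SCHEMAS or not isinstance(args, list):
        return False
    if len(args) != SCHEMAS[pred]:          # arity mismatch
        return False
    return all(isinstance(a, str) and a for a in args)  # constants, not nulls

print(check_structure({"predicate": "parent", "args": ["alice", "bob"]}))          # True
print(check_structure({"predicate": "parent", "args": ["alice", "bob", "1992"]}))  # False: ternary
```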

Grounding layer — fi.evals.Groundedness and fi.evals.SourceAttribution check whether each emitted ground atom is supported by the source context the model was given. A ground atom that is structurally valid but unsupported is flagged as a hallucination at extraction time, not after it has poisoned the graph.

Field layer — fi.evals.FieldCompleteness and fi.evals.TypeCompliance ensure that arguments are constants of the correct type (not variables, not nulls, not malformed entity strings).

A real workflow: a research-paper KG team uses a traceAI-langchain-instrumented LangChain extraction agent. Each paper produces cites/2, affiliation/2, and funded_by/2 atoms. The Dataset is versioned at v9; the team runs SchemaCompliance (returns 1.0 only on valid atoms), Groundedness (returns 1.0 only when the atom appears in the cited passage), and a custom NLI evaluator. Atoms below threshold are routed to a human-annotation queue rather than the live graph. FutureAGI’s approach is honest: we do not enforce logical entailment between atoms, but we make sure the atoms that enter the graph are well-formed and source-supported.
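The gating logic in that workflow can be sketched as a simple router; the score names and threshold here are illustrative, not part of the FutureAGI API:

```python
def route(atom: tuple, schema_score: float, grounding_score: float,
          threshold: float = 0.9) -> str:
    """Atoms that pass both checks go to the live graph;
    everything else goes to the human-annotation queue."""
    if schema_score >= 1.0 and grounding_score >= threshold:
        return "live_graph"
    return "annotation_queue"

print(route(("cites", "paper_a", "paper_b"), 1.0, 0.97))   # live_graph
print(route(("funded_by", "paper_a", "acme"), 1.0, 0.42))  # annotation_queue
```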

Unlike Ragas faithfulness, which usually scores whether an answer is supported by retrieved context, this workflow scores each extracted atom as its own schema-and-grounding unit before it reaches the graph.

How to Measure Ground-Atom Quality

Ground-atom quality decomposes into structural and semantic signals:

  • fi.evals.SchemaCompliance — returns 1.0 when the atom matches the predicate-and-arity schema; flags variable leakage and arity mismatches.
  • fi.evals.JSONValidation — returns a boolean against a JSON Schema for the structured output that wraps the atom.
  • fi.evals.Groundedness — returns 0–1 grounding score; flags atoms that are not supported by the source context.
  • fi.evals.FieldCompleteness — checks that all required arguments are populated with constants, not nulls or templates.
  • Entity-canonicalization rate (dashboard signal) — fraction of atoms whose constants resolve to a single canonical entity ID rather than a free-text string.
A minimal example running both checks on a single extracted atom:

```python
from fi.evals import SchemaCompliance, Groundedness

# Structural check: is the atom well-formed against its schema?
atom = {"predicate": "parent", "args": ["alice", "bob"]}
schema = SchemaCompliance().evaluate(input=atom)

# Semantic check: is the extracted fact supported by the source context?
grounded = Groundedness().evaluate(
    input="Who is Bob's parent?",
    output="alice is bob's parent",
    context="Alice gave birth to Bob in 1992.",
)
print(schema, grounded)
```

Common mistakes

  • Allowing free-text constants without canonicalization. parent("Alice Smith", "Bob") and parent("alice smith", "Bob Smith") are different atoms to a graph store.
  • Skipping grounding checks at extraction time. Once a fabricated atom is in the graph, every downstream query is contaminated.
  • Conflating ground atoms with ground truth. A ground atom is a syntactic category; ground truth is an evaluator’s reference dataset. Don’t mix the terms in dashboards.
  • Ignoring arity. The model may emit parent(alice, bob, 1992) when the predicate is binary; SchemaCompliance catches this only if the schema specifies arity.
  • Treating LLM extraction as a one-shot operation. Re-extract when the schema or the source documents change; otherwise previously extracted atoms go stale.

Frequently Asked Questions

What is a ground atom?

A ground atom is a formula in first-order logic of the form predicate(c1, c2, ...) where every argument is a constant rather than a variable. It represents a fully-instantiated fact.

How is a ground atom different from a regular atom?

A regular atom can contain free variables (e.g., parent(X, bob)). A ground atom has no variables — every argument is bound to a constant — so it asserts or denies a specific fact.

How does FutureAGI evaluate ground-atom extraction?

When an LLM extracts ground atoms or knowledge-graph triples, FutureAGI's SchemaCompliance and JSONValidation evaluators check structural correctness, and Groundedness verifies that each extracted fact is supported by the source context.