Models

What Is a Ground Atom?

A ground atom is the simplest fully-specified statement in first-order logic: a predicate applied to constants only, with no variables. parent(alice, bob) is a ground atom; parent(X, bob) is not, because X is a variable. FutureAGI treats ground atoms as structured facts that must be schema-valid and source-supported before an LLM writes them into a knowledge graph. They show up wherever symbolic structure meets neural extraction: triples, structured outputs, and entity-relation extraction from text.
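The ground/non-ground distinction can be sketched in a few lines, using the Prolog-style convention that uppercase-initial terms are variables (an illustrative assumption; real systems track variables explicitly rather than by casing):

```python
def is_variable(term: str) -> bool:
    """Prolog convention: uppercase-initial terms are variables."""
    return term[:1].isupper()

def is_ground(predicate: str, args: list) -> bool:
    """A ground atom has constants only -- no variables anywhere."""
    return all(not is_variable(a) for a in args)

print(is_ground("parent", ["alice", "bob"]))  # True: ground atom
print(is_ground("parent", ["X", "bob"]))      # False: X is a variable
```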

Why Ground Atoms Matter in Production LLM and Agent Systems

Most LLM applications eventually need to convert free text into structured facts: a support ticket becomes assigned_to(ticket_42, agent_kara), a research paper becomes cites(paper_a, paper_b), a transaction becomes transferred(account_x, account_y, 1500_usd, 2026_05_07). Each of these is a ground atom in disguise. Treating them as ground atoms makes downstream reasoning explicit: a Datalog rule, a graph query, a compliance check, or a tool call can fire deterministically on the extracted fact rather than re-prompting the model.
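As a hypothetical sketch of what "fire deterministically" means here: once facts are stored as ground atoms, a pattern query over them is plain set matching, with no model call involved (the fact store and `query` helper below are illustrative, not a FutureAGI API):

```python
# Extracted facts stored as ground atoms: (predicate, arg1, arg2)
facts = {
    ("assigned_to", "ticket_42", "agent_kara"),
    ("cites", "paper_a", "paper_b"),
}

def query(predicate, *pattern):
    """Match stored atoms against a pattern; None acts as a wildcard."""
    return [
        f for f in facts
        if f[0] == predicate
        and all(p is None or p == a for p, a in zip(pattern, f[1:]))
    ]

print(query("cites", "paper_a", None))  # [('cites', 'paper_a', 'paper_b')]
```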

The pain shows up when the extraction layer is sloppy. A model that emits parent("Alice Smith", "Bob") in one trace and parent("alice smith", "Bob Smith") in the next is producing unreconciled atoms — same fact, different constants. The downstream graph store has two nodes for one entity, joins fail, and a user-facing answer cites a fact that “doesn’t exist” because the constants were not canonicalized.
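A minimal sketch of the canonicalization step that prevents this, with a toy alias table standing in for a real entity-linking system (all names below are illustrative):

```python
# Toy alias table; production systems use entity linking, not a dict.
ALIASES = {
    "alice smith": "person:alice_smith",
    "alice":       "person:alice_smith",
    "bob":         "person:bob_smith",
    "bob smith":   "person:bob_smith",
}

def canonicalize(atom: tuple) -> tuple:
    """Resolve free-text constants to canonical entity IDs before storage."""
    pred, *args = atom
    return (pred, *(ALIASES.get(a.strip().lower(), a) for a in args))

a1 = canonicalize(("parent", "Alice Smith", "Bob"))
a2 = canonicalize(("parent", "alice smith", "Bob Smith"))
print(a1 == a2)  # True: both traces resolve to the same atom
```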

A second pain is ungrounded atoms: the model produces worked_at(jane_doe, openai) when the source text never mentioned OpenAI. The atom is structurally valid but factually fabricated. Without an extraction-time check, those hallucinated ground atoms accumulate in the knowledge graph and corrupt every downstream query.
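A deliberately naive sketch of an extraction-time grounding gate: reject any atom whose argument mentions never appear in the source text. Real grounding checks use NLI or evaluator models rather than substring matching, so treat this only as an illustration of where the gate sits:

```python
def is_lexically_grounded(atom: tuple, source: str) -> bool:
    """Naive check: every argument mention must appear in the source text."""
    _, *args = atom
    text = source.lower()
    return all(arg.replace("_", " ") in text for arg in args)

source = "Jane Doe spent four years at Acme Corp before founding a startup."
print(is_lexically_grounded(("worked_at", "jane_doe", "acme_corp"), source))  # True
print(is_lexically_grounded(("worked_at", "jane_doe", "openai"), source))     # False: fabricated
```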

In 2026 agent systems where planners reason over knowledge graphs (agentic-rag, llm-knowledge-graph), ground-atom quality is the trust contract. A planner that picks tool calls based on retrieved atoms will silently route around the truth if the atoms are wrong.

How FutureAGI Handles Ground-Atom Extraction

FutureAGI does not implement Datalog or first-order theorem proving. It evaluates the LLM that extracts ground atoms from text and the agent that consumes them. There are three surfaces.

Schema layer — when the model emits a structured output like {"predicate": "parent", "args": ["alice", "bob"]}, fi.evals.SchemaCompliance and fi.evals.JSONValidation check structural correctness against a JSON Schema. A failure here means the atom is malformed before semantics even apply.
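The kind of structural check this layer performs can be hand-rolled for illustration (the real evaluators validate against a JSON Schema; the `SCHEMAS` table and `check_structure` function below are hypothetical):

```python
SCHEMAS = {"parent": 2, "cites": 2}  # predicate -> expected arity

def check_structure(atom: dict) -> bool:
    """Well-formed atom: known predicate, correct arity, constant args."""
    pred, args = atom.get("predicate"), atom.get("args")
    if pred not in SCHEMAS or not isinstance(args, list):
        return False
    if len(args) != SCHEMAS[pred]:          # arity mismatch
        return False
    return all(isinstance(a, str) and a for a in args)  # constants, not nulls

print(check_structure({"predicate": "parent", "args": ["alice", "bob"]}))          # True
print(check_structure({"predicate": "parent", "args": ["alice", "bob", "1992"]}))  # False: ternary
```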

Grounding layer — fi.evals.Groundedness and fi.evals.SourceAttribution check whether each emitted ground atom is supported by the source context the model was given. A ground atom that is structurally valid but unsupported is flagged as a hallucination at extraction time, not after it has poisoned the graph.

Field layer — fi.evals.FieldCompleteness and fi.evals.TypeCompliance ensure that arguments are constants of the correct type (not variables, not nulls, not malformed entity strings).

A real workflow: a research-paper KG team uses a traceAI-langchain-instrumented LangChain extraction agent. Each paper produces cites/2, affiliation/2, and funded_by/2 atoms. The Dataset is versioned at v9; the team runs SchemaCompliance (returns 1.0 only on valid atoms), Groundedness (returns 1.0 only when the atom appears in the cited passage), and a custom NLI evaluator. Atoms below threshold are routed to a human-annotation queue rather than the live graph. FutureAGI’s approach is honest: we do not enforce logical entailment between atoms, but we make sure the atoms that enter the graph are well-formed and source-supported.
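The gating logic in that workflow can be sketched as a simple router; the score names and threshold here are illustrative, not part of the FutureAGI API:

```python
def route(atom: tuple, schema_score: float, grounding_score: float,
          threshold: float = 0.9) -> str:
    """Atoms that pass both checks go to the live graph;
    everything else goes to the human-annotation queue."""
    if schema_score >= 1.0 and grounding_score >= threshold:
        return "live_graph"
    return "annotation_queue"

print(route(("cites", "paper_a", "paper_b"), 1.0, 0.97))   # live_graph
print(route(("funded_by", "paper_a", "acme"), 1.0, 0.42))  # annotation_queue
```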

Unlike Ragas faithfulness, which usually scores whether an answer is supported by retrieved context, this workflow scores each extracted atom as its own schema-and-grounding unit before it reaches the graph.

How to Measure Ground-Atom Quality

Ground-atom quality decomposes into structural and semantic signals:

  • fi.evals.SchemaCompliance — returns 1.0 when the atom matches the predicate-and-arity schema; flags variable leakage and arity mismatches.
  • fi.evals.JSONValidation — returns a boolean against a JSON Schema for the structured output that wraps the atom.
  • fi.evals.Groundedness — returns 0–1 grounding score; flags atoms that are not supported by the source context.
  • fi.evals.FieldCompleteness — checks that all required arguments are populated with constants, not nulls or templates.
  • Entity-canonicalization rate (dashboard signal) — fraction of atoms whose constants resolve to a single canonical entity ID rather than a free-text string.
A minimal example running both checks on a single extracted atom:

```python
from fi.evals import SchemaCompliance, Groundedness

# Structural check: is the atom well-formed against its schema?
atom = {"predicate": "parent", "args": ["alice", "bob"]}
schema = SchemaCompliance().evaluate(input=atom)

# Semantic check: is the extracted fact supported by the source context?
grounded = Groundedness().evaluate(
    input="Who is Bob's parent?",
    output="alice is bob's parent",
    context="Alice gave birth to Bob in 1992.",
)
print(schema, grounded)
```

Common mistakes

  • Allowing free-text constants without canonicalization. parent("Alice Smith", "Bob") and parent("alice smith", "Bob Smith") are different atoms to a graph store.
  • Skipping grounding checks at extraction time. Once a fabricated atom is in the graph, every downstream query is contaminated.
  • Conflating ground atoms with ground truth. A ground atom is a syntactic category; ground truth is an evaluator’s reference dataset. Don’t mix the terms in dashboards.
  • Ignoring arity. The model may emit parent(alice, bob, 1992) when the predicate is binary; SchemaCompliance catches this only if the schema specifies arity.
  • Treating LLM extraction as a one-shot operation. Re-extract when the schema or the source documents change; otherwise previously extracted atoms go stale.

Frequently Asked Questions

What is a ground atom?

A ground atom is a formula in first-order logic of the form predicate(c1, c2, ...) where every argument is a constant rather than a variable. It represents a fully-instantiated fact.

How is a ground atom different from a regular atom?

A regular atom can contain free variables (e.g., parent(X, bob)). A ground atom has no variables — every argument is bound to a constant — so it asserts or denies a specific fact.

How does FutureAGI evaluate ground-atom extraction?

When an LLM extracts ground atoms or knowledge-graph triples, FutureAGI's SchemaCompliance and JSONValidation evaluators check structural correctness, and Groundedness verifies that each extracted fact is supported by the source context.