What Is Entity Extraction?
The task of identifying and extracting structured items such as names, dates, organizations, and amounts from unstructured text.
Entity extraction is the task of pulling structured fields out of unstructured text — names, organizations, dates, amounts, product IDs, addresses, anything a downstream system needs as typed data. Classical NER models tag tokens with categories like PERSON or ORG. LLM-based extractors generate JSON against a schema and rely on guided decoding or function calling. Entity extraction sits inside RAG indexing, agent planners, customer-service routers, and PII redaction. In a FutureAGI trace, an extraction shows up as an LLM span with a JSON output that gets scored against the expected schema.
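To make the LLM-based pattern concrete, here is a minimal sketch using the OpenAI Python SDK's tool-calling interface. The model name, schema, and field names are illustrative; any provider with function calling follows the same shape.

```python
import json
from openai import OpenAI  # assumes the OpenAI Python SDK; any tool-calling API works similarly

client = OpenAI()

# Illustrative extraction schema -- field names are hypothetical.
order_schema = {
    "type": "object",
    "properties": {
        "customer_id": {"type": "string"},
        "amount": {"type": "number"},
        "date": {"type": "string", "format": "date"},
    },
    "required": ["customer_id", "amount", "date"],
}

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Order for $4,200 placed 2026-01-15 by customer C-9931."}],
    tools=[{"type": "function", "function": {"name": "extract_order", "parameters": order_schema}}],
    tool_choice={"type": "function", "function": {"name": "extract_order"}},
)

# The arguments come back as a JSON string that should conform to the schema.
extracted = json.loads(resp.choices[0].message.tool_calls[0].function.arguments)
```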
Why entity extraction matters in production LLM and agent systems
Bad extraction is a silent failure. The extracted JSON looks well-formed, the API accepts it, and the downstream system runs — only with the wrong amount, the wrong customer ID, or a missing date. The pain lands on several roles. A backend engineer debugs why an order-management workflow auto-cancelled the wrong shipment. A finance team chases a $4,200 discrepancy because the LLM extracted the amount as “$4.2K” and the parser read it as 4.2, dropping the K. A compliance lead discovers the extractor pulled patient names into a logs index because the schema let it.
The common production symptoms are field-level: missing required keys, hallucinated values not present in source, incorrect types (string vs. number, ISO date vs. free text), or right-value-wrong-field swaps. End-to-end accuracy hides all of them. A trace that says “extraction_succeeded=true” because the JSON parsed is meaningless if half the fields are wrong.
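A toy field-level diff makes the point. The function below is hypothetical, not part of any SDK, but it separates the symptom classes that a parse-only check reports as success.

```python
# Hypothetical per-field checker -- illustrative, not a FutureAGI API.
def field_level_report(extracted: dict, expected: dict) -> dict:
    report = {"missing": [], "wrong_type": [], "wrong_value": [], "swapped": []}
    for key, want in expected.items():
        if key not in extracted:
            report["missing"].append(key)
        elif type(extracted[key]) is not type(want):
            report["wrong_type"].append(key)
        elif extracted[key] != want:
            # The right value under the wrong key is a swap, not a random error.
            bucket = "swapped" if extracted[key] in expected.values() else "wrong_value"
            report[bucket].append(key)
    return report

print(field_level_report(
    extracted={"customer_id": "2026-01-15", "amount": 4.2},
    expected={"customer_id": "C-9931", "amount": 4200.0, "date": "2026-01-15"},
))
# {'missing': ['date'], 'wrong_type': [], 'wrong_value': ['amount'], 'swapped': ['customer_id']}
```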
In 2026-era agent stacks, extraction is rarely a one-shot call. A planner extracts intent, a tool extracts parameters, a retriever extracts citations, a critic extracts disagreements. A 5% per-step error compounds: five steps at 95% accuracy each leave only about a 77% chance (0.95^5 ≈ 0.77) that every field survives the pipeline. Multi-step pipelines need step-level extraction evaluators wired to OTel spans so you see which step lost the field, not just that the final JSON looks off.
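A sketch of the step-level wiring with the OpenTelemetry Python API; the span and attribute names are illustrative, and the inline score stands in for a real evaluator call.

```python
from opentelemetry import trace

tracer = trace.get_tracer("extraction-pipeline")

# One span per extraction step, each carrying its own eval scores as
# attributes, so a dashboard can show which step dropped the field.
with tracer.start_as_current_span("planner.extract_intent") as span:
    extracted = {"intent": "cancel_shipment", "order_id": None}  # stand-in for LLM output
    required = ["intent", "order_id"]
    completeness = sum(extracted.get(k) is not None for k in required) / len(required)
    span.set_attribute("eval.field_completeness", completeness)  # 0.5: order_id was lost here
```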
How FutureAGI handles entity extraction
FutureAGI’s approach is to score extraction at three levels. Syntactic correctness: IsJson confirms the output parses as JSON, and JSONValidation checks it against a JSON Schema. Field-level coverage: FieldCompleteness returns a 0–1 score for whether required fields are populated; FieldCoverage compares response fields to expected output. Semantic correctness: Groundedness and FactualConsistency check whether the extracted values are supported by the source text rather than hallucinated. For PII-heavy domains, the PII evaluator flags unintended extraction of protected identifiers, and a pre-guardrail on Agent Command Center can redact PII from the prompt before extraction runs.
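What the syntactic gate amounts to is easy to see with the open-source jsonschema package, used here in place of the FutureAGI evaluator; the schema and output values are illustrative.

```python
from jsonschema import Draft202012Validator  # open-source jsonschema package

schema = {
    "type": "object",
    "properties": {"customer_id": {"type": "string"}, "amount": {"type": "number"}},
    "required": ["customer_id", "amount"],
}
llm_output = {"customer_id": "C-9931", "amount": "4.2K"}  # a string where a number belongs

# Enumerate every violation rather than stopping at pass/fail.
errors = [e.message for e in Draft202012Validator(schema).iter_errors(llm_output)]
print(errors)  # ["'4.2K' is not of type 'number'"]
```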
A practical pattern: a logistics team runs an LLM extractor on shipping emails to populate a tracking schema. They instrument the chain with traceAI-langchain, attach JSONValidation, FieldCompleteness, and Groundedness per response, and dashboard eval-fail-rate-by-cohort sliced by carrier and email template. When fail rate spikes for one carrier, the trace view points to a new email layout that buries the tracking number in the footer; they update the prompt and rerun a regression eval against the canonical golden dataset before redeploy. Unlike spaCy NER on its own, the extraction pipeline now has step-level evaluators, schema validation, and a regression history.
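The eval-fail-rate-by-cohort signal itself is a plain aggregation. A sketch with pandas over a hypothetical export of per-response eval results:

```python
import pandas as pd

# Hypothetical export: one row per scored extraction.
df = pd.DataFrame({
    "carrier":   ["dhl", "dhl", "ups", "ups", "ups"],
    "template":  ["v2",  "v3",  "v1",  "v1",  "v1"],
    "eval_pass": [True,  False, True,  True,  False],
})

# Failure rate per (carrier, template) cohort; spikes localize fragile prompts.
fail_rate = 1 - df.groupby(["carrier", "template"])["eval_pass"].mean()
print(fail_rate.sort_values(ascending=False))
```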
How to measure entity extraction
- JSONValidation: validates output against a JSON Schema; the canonical syntactic gate.
- FieldCompleteness: returns 0–1 for whether required fields are filled; surfaces silent omissions.
- Groundedness: checks whether each extracted value is supported by the source text; catches hallucinated fields.
- PII: flags personal data the schema did not authorize.
- FuzzyMatch or Equals: compares extracted values against ground-truth labels for batch evaluation.
- eval-fail-rate-by-cohort (dashboard signal): per-template, per-locale, or per-carrier failure rates point to where the prompt or schema is fragile.
Minimal Python (order_schema and llm_json are shown with illustrative values):
from fi.evals import JSONValidation, FieldCompleteness

order_schema = {"type": "object", "required": ["customer_id", "amount", "date"]}
llm_json = '{"customer_id": "C-9931", "amount": 4200, "date": "2026-01-15"}'  # extractor output

schema_check = JSONValidation(schema=order_schema)
fields = FieldCompleteness(required=["customer_id", "amount", "date"])

result = schema_check.evaluate(output=llm_json)
print(result.score, fields.evaluate(output=llm_json).score)
Common mistakes
- Trusting JSON-mode parsing as a quality signal. A parsable JSON with the wrong values is still a failure; pair with field-level checks.
- One schema for every locale. Date formats, name orders, and address structures differ; locale-aware schemas reduce false negatives (see the date-normalization sketch after this list).
- Skipping groundedness checks. An LLM will happily invent a customer ID that “fits the pattern”; verify each value against source text.
- Ignoring PII surface area. Extractors often pull more than the schema asks for; redact upstream and audit downstream.
- Treating partial extraction as success. A 90% field-completion rate hides the 10% of records that downstream systems will silently corrupt.
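On the locale point, a minimal normalization sketch; the format table and function name are illustrative, not exhaustive. Converting locale-specific dates to ISO 8601 before the schema gate stops the same string from meaning two different dates.

```python
from datetime import datetime

# Illustrative subset of per-locale date formats (%B assumes English month names).
LOCALE_DATE_FORMATS = {
    "en_US": ["%m/%d/%Y", "%B %d, %Y"],
    "de_DE": ["%d.%m.%Y"],
}

def normalize_date(raw: str, locale: str) -> str:
    """Normalize a locale-formatted date to ISO 8601 before validation."""
    for fmt in LOCALE_DATE_FORMATS.get(locale, []):
        try:
            return datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unparseable {locale} date: {raw!r}")

print(normalize_date("03/04/2026", "en_US"))  # 2026-03-04 (March 4)
print(normalize_date("03.04.2026", "de_DE"))  # 2026-04-03 (April 3): same digits, different date
```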
Frequently Asked Questions
What is entity extraction?
Entity extraction is the task of identifying and pulling out structured items — names, organizations, dates, amounts, IDs — from unstructured text into typed fields a downstream system can use.
How is entity extraction different from named entity recognition (NER)?
NER is the classical sequence-labeling form of entity extraction with a fixed tag set like PERSON, ORG, DATE. Entity extraction is broader and includes LLM-driven schema-based extraction of arbitrary domain fields.
How do you measure entity extraction quality?
FutureAGI scores extraction with JSONValidation for schema correctness, FieldCompleteness for required-field coverage, Groundedness to verify each value against the source text, and PII to catch unintended personal-data extraction.