What Is PII (Personally Identifiable Information)?

Data that can identify a specific individual either directly or in combination with other available information, requiring detection and redaction in LLM pipelines.

PII is any data that identifies a specific person, either on its own or in combination with other accessible information. Direct identifiers include full name, government IDs, email, phone number, and biometric data. Indirect identifiers (IP address, device ID, ZIP code plus date of birth, or rare job-title-plus-employer combinations) become PII the moment they can re-identify someone. In LLM pipelines, PII flows through user prompts, retrieved context windows, tool outputs, and generated responses. Because language models can memorize and recombine what they see, PII detection has to happen at every model boundary, synchronously, with audit-grade logs.

The 2026 update to the textbook definition is that “indirect” has gotten broader. With frontier models like GPT-5.x, Claude Opus 4.7, and Gemini 3.x trivially correlating four or five weak signals into a re-identification, the threshold for “identifies a specific person” has dropped. A ZIP code plus an unusual employer title plus a recent project name is now PII in the same operational sense that an SSN is. Treat the term as a function of model capability, not a fixed taxonomy.
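To make the "combination of weak signals" point concrete, here is a minimal sketch that counts how many records share each combination of quasi-identifiers. The records, field names, and helper functions are hypothetical and exist only for illustration; a combination that matches exactly one record is the re-identifiable case.

```python
from collections import Counter

# Hypothetical records: no field here is a direct identifier on its own.
records = [
    {"zip": "94107", "title": "Staff Engineer", "project": "Atlas"},
    {"zip": "94107", "title": "Staff Engineer", "project": "Atlas"},
    {"zip": "94107", "title": "Head of Quantum Ops", "project": "Borealis"},
    {"zip": "10001", "title": "Staff Engineer", "project": "Atlas"},
]

def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)

def unique_combinations(rows, quasi_identifiers):
    """Return the combinations that match exactly one row (re-identifiable)."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [combo for combo, n in sizes.items() if n == 1]

# ZIP alone isolates one person; ZIP + title + project isolates two.
print(unique_combinations(records, ["zip"]))
print(unique_combinations(records, ["zip", "title", "project"]))
```

Adding fields can only shrink the groups, which is why each extra weak signal raises re-identification risk.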

Why PII matters in production LLM and agent systems

A leaked PII record is the failure mode regulators read first. Under GDPR, a confirmed personal-data breach must generally be reported to the supervisory authority within 72 hours of the controller becoming aware of it; under HIPAA, a breach involving even a single record triggers notification duties. The EU AI Act, whose main high-risk obligations are scheduled to apply from August 2026, layers an additional duty: any high-risk system that processes personal data without documented controls is presumed non-compliant. The fines scale with affected users, but the reputational hit lands the same day the incident becomes public.

The cost of a PII leak in an LLM application is also asymmetric in a way it is not in a classical web app. A record leaked from a database affects that one record. A record echoed by a model that has been trained or RAG-grounded on a corpus can be echoed again in every future session that touches a similar query. Re-identification compounds: the same fragment paired with two innocuous follow-up questions can recover the original record. This is why the 2026 regulator focus has moved from “did you have a leak?” to “how would you know if you had a leak, and how fast can you stop it?” Both questions are answered by inline detection, not batch review.

The leak surfaces are subtle in agent stacks. A RAG chain pulls a CRM ticket into context; the model summarizes it and returns the customer’s phone number to a different user. An agent calls a lookup_user tool with a partial name and the tool returns a 50-row table; the model echoes one of the rows in its next tool call. A fine-tuned model trained on chat logs regenerates a snippet of someone’s email address verbatim. Cross-session leakage (user A getting fragments of user B’s data) is not a hypothetical risk in 2026 production; it is something every observability team has seen in agent trajectory reviews.

The MCP era has made the surface larger. An MCP server that exposes “search internal docs” to an agent now hands the model raw context that the application never intended to ship to the user. Indirect PII leak through tool outputs is the dominant 2026 incident class, not direct echo of user input.

Roles affected: legal asks for the audit log, security asks for the detection coverage, product asks why the model is blocked, support fields the user complaint. Engineering owns all four answers. The right architectural answer is to push PII detection out of application code and into the LLM gateway, where every route inherits the same policy.

What 2026 regulators actually ask for

The audit-time question has shifted between 2023 and 2026. Three years ago, regulators asked “do you have a PII policy?” Today, a GDPR Art. 30 record-of-processing request or a HIPAA breach review asks for the request ID, the evaluator score, the policy version, the action taken (block / redact / allow), the model and prompt versions, and the retention timestamp, all stitched into one evidence chain. The EU AI Act audit framework, formalised in 2026 secondary legislation, asks for the same thing for high-risk systems plus a documented AI risk assessment and red-team coverage. A team without a unified trace-and-evaluator store ends up reconstructing this evidence by hand from five systems, which is the operational pain that drives buy decisions in 2026.

PII is also the entry point for downstream programs: data privacy, HIPAA compliance for health workflows, PCI-DSS for payment flows, FERPA for education data, GLBA for financial services, and the EU AI Act’s Art. 10 data-governance requirements. A PII gap in the gateway becomes a five-framework gap downstream.

How FutureAGI handles PII

FutureAGI ships two evaluators that anchor a PII program: PII and DataPrivacyCompliance. The PII evaluator targets specific identifier classes (names, emails, phone numbers, SSNs, addresses, credit-card numbers, MRNs, IBANs) and returns Passed or Failed with a reason naming the offending span. DataPrivacyCompliance is broader: it scores whether the output as a whole aligns with GDPR for LLMs, HIPAA, CCPA, and PCI-DSS rubrics, which is useful when you want a single signal rather than a per-class result. For health-data workflows we additionally recommend pairing with ClinicallyInappropriateTone and content safety checks.

Both run inside Agent Command Center as pre-guardrail and post-guardrail stages. A common configuration: pre-guardrail: [PII] strips identifiers from user inputs before the model sees them; post-guardrail: [PII, DataPrivacyCompliance] catches model outputs that surface PII from the context window. On a Failed result the gateway can block, redact (replace tokens with deterministic placeholders that preserve referential integrity within a session), or escalate to a human-in-the-loop queue.
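The redaction option, with placeholders that stay consistent inside a session, can be sketched as follows. This is not FutureAGI's implementation: the `SessionRedactor` class, the `[EMAIL_n]` placeholder format, and the simplified email regex are assumptions made for illustration, and a real gateway would cover many more identifier classes.

```python
import re

# Simplified email pattern for illustration; real traffic needs broader detection.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

class SessionRedactor:
    """Replaces detected emails with placeholders that are stable within one session."""

    def __init__(self):
        # original value -> placeholder; lives only as long as this session object
        self._placeholders = {}

    def redact(self, text):
        def substitute(match):
            value = match.group(0).lower()
            if value not in self._placeholders:
                self._placeholders[value] = f"[EMAIL_{len(self._placeholders) + 1}]"
            return self._placeholders[value]
        return EMAIL_RE.sub(substitute, text)

session = SessionRedactor()
first = session.redact("Contact ana@example.com or bo@example.org.")
second = session.redact("Did ana@example.com reply?")
print(first)   # Contact [EMAIL_1] or [EMAIL_2].
print(second)  # Did [EMAIL_1] reply?
```

Because the mapping is held per session object, the model can still tell that both messages refer to the same person, while a new session starts numbering from scratch.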

FutureAGI’s approach to PII in 2026 is different from a pure detector like Microsoft Presidio or Amazon Comprehend. Those tools score text in isolation. FutureAGI scores text in the context of the agent trajectory: a name in an isolated prompt is low-risk, but the same name appearing in a post-guardrail stage after the model accessed a customer record is high-risk. The agent.trajectory.step attribute carried by traceAI gives the evaluator the trajectory signal it needs.

Engineering then closes the loop. Every guardrail decision writes to the audit log with the request ID, evaluator, decision, reason, and policy version. The same fi.evals evaluators run offline against a regression dataset of known-PII and known-clean prompts so you can confirm precision and recall before flipping a policy change. When a customer asks “do you process my SSN?”, the answer is a query against the audit-log store, not a meeting.
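As a rough sketch of what one audit-log entry could look like, the record below carries the fields named above (request ID, evaluator, decision, reason, policy version) plus a timestamp. The class name, field names, and JSON-lines format are assumptions for illustration, not the product's actual schema.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

ALLOWED_DECISIONS = {"block", "redact", "allow"}

@dataclass(frozen=True)
class GuardrailAuditRecord:
    request_id: str
    evaluator: str
    decision: str        # one of ALLOWED_DECISIONS
    reason: str          # describe the finding; never copy the PII itself here
    policy_version: str
    logged_at: str       # ISO 8601 timestamp in UTC

def make_record(request_id, evaluator, decision, reason, policy_version):
    """Build an immutable audit record, rejecting unknown decisions."""
    if decision not in ALLOWED_DECISIONS:
        raise ValueError(f"unknown decision: {decision}")
    return GuardrailAuditRecord(
        request_id, evaluator, decision, reason, policy_version,
        datetime.now(timezone.utc).isoformat(),
    )

def to_log_line(record):
    """Serialize to one JSON line so the log store can be queried by field."""
    return json.dumps(asdict(record), sort_keys=True)

record = make_record("req-001", "PII", "redact", "email address in output", "policy-v3")
line = to_log_line(record)
print(line)
```

Keeping the reason descriptive rather than quoting the offending span avoids turning the audit log into a second copy of the PII.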

Where PII lives in an LLM trace

A PII program needs to know every span where PII can enter or exit the model. In a typical 2026 agent stack instrumented with traceAI, those spans are:

  • gen_ai.user.message: user input. Pre-guardrail target.
  • retrieval.documents: retrieved chunks from RAG. PII often slips in through old CRM records, support tickets, or untriaged uploads.
  • tool.output: function-call results from internal APIs, MCP servers, or peer agents. The dominant indirect vector.
  • agent.memory.read: recalled long-term agent memory. User-A memory should never surface to user-B.
  • gen_ai.assistant.message: the model’s response. Post-guardrail target.
  • agent.handoff.message: text passed between sub-agents. Often skipped in review.
  • log.body: observability log lines that captured prompts or responses verbatim. In scope for the same regulations as the model itself.

The PII evaluator should attach to all seven, not just the first and last. We see most missed-leak incidents originate at tool.output or agent.handoff.message precisely because they look like “internal” data flow.
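One lightweight way to catch the "first and last only" gap is a configuration check that compares the spans a route actually evaluates against the seven listed above. The span names come from that list; the `coverage_gaps` helper and the example configuration are hypothetical.

```python
# The seven span names where PII can enter or exit the model, per the list above.
PII_SPAN_NAMES = {
    "gen_ai.user.message",
    "retrieval.documents",
    "tool.output",
    "agent.memory.read",
    "gen_ai.assistant.message",
    "agent.handoff.message",
    "log.body",
}

def coverage_gaps(evaluated_span_names):
    """Return the in-scope span names that no PII evaluator is attached to."""
    return sorted(PII_SPAN_NAMES - set(evaluated_span_names))

# Hypothetical route that only checks the user input and the final response.
configured = ["gen_ai.user.message", "gen_ai.assistant.message"]
for name in coverage_gaps(configured):
    print("missing PII evaluator on:", name)
```

Running a check like this in CI for every route configuration is one way to keep "internal" flows such as tool.output from silently dropping out of scope.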

PII identifier classes and 2026 detection bars

| Identifier class | Detection method (2022) | Detection method (2026) | Recall target | Notes |
| --- | --- | --- | --- | --- |
| SSN, EIN, TIN | Regex | Regex + PII judge | 0.99+ | Format is fixed; cheap to nail |
| Credit card (PAN) | Regex + Luhn | Regex + Luhn + PII judge | 0.99+ | PCI-DSS requires log-level redaction |
| Email | Regex | PII judge (handles obfuscation) | 0.97 | j[dot]doe[at]ex defeats regex |
| Phone number | Regex per locale | PII judge | 0.95 | International formats are messy |
| Full name | NER (Presidio) | PII judge with NER fallback | 0.92 | Hardest class; context-dependent |
| Address | NER | PII judge | 0.90 | Partial addresses still re-identify |
| IP address | Regex | Regex (GDPR treats as personal data) | 0.99 | Combined with a timestamp = PII |
| MRN / health ID | Regex per system | PII judge | 0.97 | HIPAA scope; pair with PHI rubric |
| Biometric vector | Manual schema flag | Schema flag at gateway | 1.00 | Detect at structured input layer |
| Indirect PII (job + ZIP + date) | None practical | DataPrivacyCompliance LLM judge | 0.85 | Recall < 1.0 is acceptable here |
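The "Regex + Luhn" stage for card numbers in the table can be sketched as below. The regex and function names are illustrative assumptions; the pattern accepts 13 to 19 digits with optional single spaces or hyphens, and the Luhn checksum filters out digit runs that merely look like card numbers.

```python
import re

# Candidate PANs: 13 to 19 digits, optionally separated by single spaces or hyphens.
PAN_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text):
    """Return digit strings that match the PAN shape and pass the Luhn check."""
    hits = []
    for match in PAN_RE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group(0))
        if luhn_valid(digits):
            hits.append(digits)
    return hits

sample = "Order 1234 5678 9012 3456, card 4111-1111-1111-1111."
print(find_card_numbers(sample))  # ['4111111111111111']
```

The order number has the right shape but fails the checksum, so only the well-known test card number is reported. A prefilter like this is cheap, which is why the table still pairs it with a judge for obfuscated or split-up numbers.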

How to detect PII in LLM pipelines

PII detection health is measured at the cohort level, not the request level:

  • PII evaluator failure-rate: fraction of inputs or outputs containing detected identifiers, broken down by route, identifier class, and agent.trajectory.step.
  • Precision and recall against a labeled regression set: recall above 0.97 for SSN/credit-card classes is the 2026 bar; precision above 0.90 to keep false redactions tolerable. Track per-class, not aggregate.
  • Redaction latency: added p99 for the pre-guardrail and post-guardrail stages; budget 30–120 ms for a judge-model guardrail, < 10 ms for a regex prefilter.
  • Audit-log completeness: every blocked or redacted request logged with reason, evaluator version, policy version, and request ID; missing entries are compliance gaps.
  • Cross-session leakage probes: synthetic test cases where user A’s PII is in shared context and user B’s request should never surface it. Run weekly against a fixed golden dataset.
  • MCP tool-output coverage: every MCP server response a model can see should pass through the PII evaluator before re-entering the planner.

Minimal pairing snippet:

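from fi.evals import PII, DataPrivacyCompliance

# Instantiate each evaluator once and reuse it across traces.
pii = PII()
privacy = DataPrivacyCompliance(framework="gdpr")

# production_sample is assumed to be an iterable of sampled production traces,
# loaded elsewhere, each exposing user_message, assistant_message, and tag().
for trace in production_sample:
    pii_in = pii.evaluate(input=trace.user_message)        # identifiers in the user input
    pii_out = pii.evaluate(input=trace.assistant_message)  # identifiers in the model response
    privacy_out = privacy.evaluate(output=trace.assistant_message)  # rubric-level check
    # Flag the trace for review if any of the three checks failed.
    if any(r.score == "Failed" for r in (pii_in, pii_out, privacy_out)):
        trace.tag("pii_review")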
In our 2026 evals, the most valuable signal is not the aggregate fail-rate but the per-class precision/recall over a 14-day rolling window. A drop in email recall almost always indicates a new obfuscation pattern in user traffic; a drop in name precision usually means a benign celebrity reference is tripping the judge. For external calibration, the BeaverTails safety corpus (~333K labeled QA pairs, 14 harm categories including privacy violations) and Gray Swan’s AgentHarm (110 agentic-harm prompts spanning PII exfiltration, doxxing, and credential leakage) are the standard public anchors. Frontier-model PII-leakage rates on AgentHarm sit in the 8–22% band before guardrails, dropping below 2% once a judge-model PII evaluator is in front of the gateway.
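Per-class precision and recall are simple to compute once each labeled example carries its identifier class. The sketch below assumes a hypothetical input format of (class, ground-truth label, detector verdict) tuples; it does not use any fi.evals API, and windowing by date is left out.

```python
from collections import defaultdict

def per_class_metrics(examples):
    """examples: iterable of (identifier_class, is_pii_label, detector_flagged)."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for cls, label, flagged in examples:
        c = counts[cls]  # touch the class so all-negative classes still appear
        if flagged and label:
            c["tp"] += 1
        elif flagged and not label:
            c["fp"] += 1
        elif label and not flagged:
            c["fn"] += 1
    metrics = {}
    for cls, c in counts.items():
        predicted = c["tp"] + c["fp"]
        actual = c["tp"] + c["fn"]
        metrics[cls] = {
            # None means the metric is undefined for this class in this sample.
            "precision": c["tp"] / predicted if predicted else None,
            "recall": c["tp"] / actual if actual else None,
        }
    return metrics

# Hypothetical labeled sample: one missed email, one false alarm on a name.
examples = [
    ("email", True, True),
    ("email", True, True),
    ("email", True, False),
    ("name", True, True),
    ("name", False, True),
]
metrics = per_class_metrics(examples)
print(metrics)
```

Here email recall drops to 2/3 while name precision drops to 0.5, two different failure modes that an aggregate fail-rate would blur together.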

Operational benchmarks: latency, throughput, and false-positive budget

In 2026 production, a PII guardrail has three operational constraints. The first is latency: a judge-model PII evaluator typically runs in 80–180 ms; a regex prefilter in < 5 ms. We recommend a two-stage architecture: a fast regex prefilter, then a judge model only for routes where false negatives are catastrophic. The second is throughput: at peak, an enterprise gateway might score 5–20k inputs per second. A 100 ms-per-call judge cannot keep up alone; it must be batched, cached, or short-circuited by the prefilter. The third is the false-positive budget: false redactions degrade user experience, false blocks generate support tickets. Our default is to budget 0.5–2% false-positive rate per identifier class and to alert when the rate breaks budget for three consecutive hours, which usually indicates either a model regression or a new input pattern worth a manual review.

Building the regression dataset that backs the gateway

A PII regression dataset is the single asset that determines whether your gateway holds up over time. Three properties matter: coverage, freshness, and label quality.

  • Coverage: every identifier class in your traffic, plus the three indirect-PII patterns most common in your domain. For a healthcare app: MRN, NPI, date-of-service, plus rare-condition + ZIP combinations. For a fintech app: account number, routing number, transaction ID + amount + merchant.
  • Freshness: 5–10% of rows refreshed monthly from sampled production traces, triaged with LLM-as-a-judge and human review. Static datasets go stale within a quarter as user phrasing drifts.
  • Label quality: dual-annotator pass with arbitration. Single-annotator labels carry 8–12% disagreement on free-text PII, which floors your achievable precision/recall at the same rate.

The dataset lives in fi.datasets.Dataset, versioned alongside the policy. Every policy change runs the eval first and ships only if recall holds and precision moves within budget.
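The ship/no-ship rule can be expressed as a small gate over per-class metrics. This is a sketch under stated assumptions: the metric dictionaries, the `policy_change_allowed` name, and the 0.02 precision budget are hypothetical and do not depend on fi.datasets or fi.evals.

```python
def policy_change_allowed(baseline, candidate, max_precision_drop=0.02):
    """Compare per-class metrics; return (ok, list of human-readable failures).

    baseline / candidate: {identifier_class: {"precision": float, "recall": float}}
    """
    failures = []
    for cls, base in baseline.items():
        cand = candidate.get(cls)
        if cand is None:
            failures.append(f"{cls}: missing from candidate results")
            continue
        if cand["recall"] < base["recall"]:
            failures.append(f"{cls}: recall fell {base['recall']} -> {cand['recall']}")
        if base["precision"] - cand["precision"] > max_precision_drop:
            failures.append(f"{cls}: precision fell {base['precision']} -> {cand['precision']}")
    return (not failures, failures)

# Hypothetical results from running the regression dataset before and after a change.
baseline = {
    "ssn": {"precision": 0.95, "recall": 0.99},
    "email": {"precision": 0.93, "recall": 0.97},
}
candidate = {
    "ssn": {"precision": 0.94, "recall": 0.99},
    "email": {"precision": 0.93, "recall": 0.95},
}
ok, failures = policy_change_allowed(baseline, candidate)
print(ok, failures)
```

In this example the SSN precision dip stays within budget, but the email recall regression blocks the change.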

Common mistakes (May 2026 edition)

  • Regex-only detection. Regex catches well-formed SSNs and credit cards; it misses names, free-text addresses, indirect identifiers, and obfuscated inputs (j dot doe at ex dot com). Pair regex with the PII judge-model evaluator. This is the single biggest gap we see in 2026 audits.
  • Detecting PII in inputs but not outputs. Most leaks are in the model’s response, not the user’s prompt, because context-window content surfaces in unexpected ways. Run post-guardrail: [PII] on every customer-facing route.
  • Ignoring tool outputs as a PII source. MCP server responses, agent tool results, and RAG retrieval are 2026’s dominant indirect-PII vectors. Treat every external boundary as in-scope.
  • Logging full prompts to a non-isolated store. Your observability platform now holds raw PII; treat it as in-scope for the same regulations. Trace storage needs the same redaction policy as production logs.
  • Treating IP and timestamp as non-PII. Under GDPR, an IP address is personal data. Default to “if combined with anything else it identifies a person, it is PII.”
  • No regression suite for the PII evaluator itself. Detector quality drifts as input distributions shift; pin a labeled dataset and re-run weekly via LLM regression testing.
  • Confusing redaction with deletion. A redacted record is still in scope for “right to be forgotten” until the underlying log line is purged. Build a deletion pipeline, not just a redaction one.
  • Skipping fairness checks on the redactor. A PII judge that under-detects names from non-English locales is a discrimination risk. Slice precision/recall by locale and audit quarterly.
  • No backpressure between detection and the model. If your pre-guardrail fires async and the model runs anyway, you have a logging system, not a guardrail. The block decision must complete before the inference call.
  • One PII evaluator across regulated and unregulated routes. Marketing copy and clinical advice have different acceptable identifiers (a marketing list can include an email; a clinical summary cannot include an MRN without consent). Per-route policy is the 2026 default.
  • Failing to revoke session-scoped redaction placeholders. A redacted token like [PERSON_4] that persists across sessions becomes a re-identification vector against the audit log itself. Scope placeholders to the session and rotate the salt.
  • Treating fine-tuned-model PII memorization as someone else’s problem. If you fine-tuned on chat logs, your weights now contain PII. Run PII against canary prompts on the fine-tuned model and treat positive hits as a data-deletion-pipeline trigger.
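The backpressure mistake in the list above is the easiest one to show in code: the guardrail decision has to be awaited before the inference call is made. The sketch below uses asyncio with hypothetical `pre_guardrail` and `call_model` coroutines standing in for the real evaluator and model client.

```python
import asyncio

calls = []  # records the order in which the stages actually ran

async def pre_guardrail(prompt):
    """Hypothetical guardrail: blocks prompts that mention an SSN."""
    await asyncio.sleep(0.01)  # simulate detection latency
    calls.append("guardrail")
    return "block" if "ssn" in prompt.lower() else "allow"

async def call_model(prompt):
    """Hypothetical inference call."""
    calls.append("model")
    return f"response to: {prompt}"

async def handle(prompt):
    # Await the decision first; launching both concurrently would let the
    # model see the prompt before the guardrail could block it.
    decision = await pre_guardrail(prompt)
    if decision == "block":
        return "Request blocked by PII policy."
    return await call_model(prompt)

blocked = asyncio.run(handle("My SSN is 123-45-6789"))
allowed = asyncio.run(handle("What is your refund policy?"))
print(blocked)
print(allowed)
print(calls)  # ['guardrail', 'guardrail', 'model']
```

The model is never invoked for the blocked request, which is the property that separates a guardrail from a logging system.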

Frequently Asked Questions

What is PII?

PII is any information that can identify a person directly (name, SSN, email) or indirectly when combined with other data (IP plus timestamp, or location plus job title). In LLM systems, PII enters via user input, retrieved context, or tool output.

How is PII different from PHI?

PHI (protected health information) is the HIPAA-specific subset of PII tied to a person's health status, treatment, or payment. All PHI is PII, but PII like a marketing email is not PHI. HIPAA imposes stricter handling than general privacy law.

How do you detect PII in LLM outputs?

FutureAGI's PII evaluator runs as a post-guardrail in Agent Command Center, returning Passed or Failed plus a reason. Failed responses can be auto-redacted, blocked, or escalated before they reach the user.