
What Is a Trust Service?

A trust service is a third-party assurance offering that produces an auditor-issued report attesting that a system meets a defined set of controls. The dominant example is SOC 2, whose Trust Services Criteria (TSCs) cover security, availability, processing integrity, confidentiality, and privacy. Other examples include ISO 27001 certifications, HIPAA attestations, and the newer ISO 42001 management-system standard for AI. Trust services do not measure model behaviour directly; they consume evidence that internal controls are in place. For AI platforms, that evidence increasingly includes evaluation reports, audit logs, and dataset versioning artefacts.

Why It Matters in Production LLM and Agent Systems

Trust services are the gatekeeper between an AI startup and an enterprise procurement contract. A regulated buyer — financial services, healthcare, government — will not sign a deal without a SOC 2 Type II report or equivalent. Increasingly, the same buyers also ask for AI-specific attestations: “show me your bias-evaluation evidence”, “show me your model card”, “how do you log decisions for audit?” This is no longer a paperwork exercise; it is a request for quantitative evidence.

Pain shows up in four places. First, audit timelines: a SOC 2 audit cycle can take 6–12 months and stalls if evidence is unstructured. Second, evidence sprawl: evaluation results live in notebooks, audit logs in S3, model cards in Notion — auditors cannot assemble a coherent trail. Third, AI-specific gaps: traditional SOC 2 controls do not cover hallucination, fairness, or prompt-injection risk, and auditors are now asking about all three. Fourth, 2026-era multi-step agents: a single user request fans out into ten model calls and three tool invocations — which one is “the decision” the audit reviews?

Multi-step agent stacks need decision-level audit logs that connect a final outcome to its trajectory. Without that, a trust-service audit cannot reconstruct what happened. FutureAGI’s traceAI surfaces are designed exactly for this.

How FutureAGI Handles Trust-Service Evidence

FutureAGI is not a trust service — we do not issue SOC 2 reports. We are the evidence-generation layer underneath one. A team going through a SOC 2 or ISO 42001 audit can pull four kinds of artefacts from FutureAGI:

Evaluation reports. Every Dataset.add_evaluation() run produces a versioned report tying a model version to evaluator scores: Groundedness, BiasDetection, DataPrivacyCompliance, ContentSafety. Auditors love versioned, dated artefacts.
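A sketch of wiring such a run; only Dataset and add_evaluation() are named above, so the constructor and keyword arguments here are assumptions, not the SDK's confirmed signature:

from fi.datasets import Dataset

# Hypothetical wiring: attach an evaluator run to a named dataset so
# the resulting report is versioned and dated. Keyword names below
# are illustrative.
ds = Dataset(name="support-bot-eval")
ds.add_evaluation(
    name="2026-q1-privacy-run",
    evaluator="DataPrivacyCompliance",
)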

Audit logs. The Agent Command Center records every guardrail check, every routing decision, and every model fallback as a structured log entry. Both pre-guardrail and post-guardrail outcomes are auditable.
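For illustration, a decision-level entry of the kind described above might carry fields like these (a hypothetical shape, not the Agent Command Center's actual schema):

log_entry = {
    "timestamp": "2026-01-15T09:42:03Z",
    "request_id": "req_8f3a21",
    "stage": "pre-guardrail",   # or "post-guardrail"
    "check": "ContentSafety",
    "outcome": "pass",
    "route": "primary-model",   # routing decision taken
    "fallback_used": False,
}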

Trace artefacts. traceAI-langchain, traceAI-openai-agents, and other integrations capture span-level data on prompts, retrieved context, tool calls, and outputs — the multi-step decision audit trail.
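Instrumentation typically follows the pattern sketched below; the register() arguments are assumptions and may differ by package version:

from fi_instrumentation import register
from traceai_langchain import LangChainInstrumentor

# Register a tracer provider, then instrument LangChain so prompts,
# retrieved context, tool calls, and outputs are captured as spans.
provider = register(project_name="support-agent")
LangChainInstrumentor().instrument(tracer_provider=provider)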

Dataset versioning. fi.datasets.Dataset preserves training and evaluation data with content hashes, so an auditor can reproduce a result months later.

Compared with the ad-hoc spreadsheet evidence many teams hand to auditors, FutureAGI's structured outputs substantially reduce audit-prep time. We are explicit about scope: FutureAGI improves the quality of the evidence that flows into a trust-service audit; the trust service itself is issued by the auditor.
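As a concrete illustration of the reproducibility point, the check an auditor relies on reduces to re-hashing the pinned data; the hashing below is a sketch of the concept, not the SDK's internal scheme:

import hashlib
import json

def dataset_hash(records: list[dict]) -> str:
    # Canonicalise the records before hashing so the same data always
    # yields the same digest regardless of key order.
    blob = json.dumps(records, sort_keys=True).encode("utf-8")
    return "sha256:" + hashlib.sha256(blob).hexdigest()

records = [{"input": "What is my balance?", "output": "..."}]
print(dataset_hash(records))  # store this next to the evaluation report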

How to Measure or Detect It

Audit-readiness is a checklist, not a single metric; a sketch that rolls these up into coverage ratios follows the list:

  • Evaluator coverage: which evaluators are wired and which TSCs they map to (DataPrivacyCompliance → privacy TSC).
  • Audit-log completeness: percentage of decisions with a captured pre-guardrail/post-guardrail log entry.
  • Trace coverage: percentage of model calls instrumented via traceAI.
  • Dataset reproducibility: percentage of evaluation runs that can be re-executed against pinned dataset versions.
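The roll-up itself is simple arithmetic over counts your logging already tracks; the numbers below are illustrative placeholders, not real measurements:

def coverage_pct(covered: int, total: int) -> float:
    # Guard against empty denominators during early rollout.
    return 100.0 * covered / total if total else 0.0

# Illustrative counts only.
print(f"audit-log completeness:  {coverage_pct(9412, 9500):.1f}%")
print(f"trace coverage:          {coverage_pct(31000, 32400):.1f}%")
print(f"dataset reproducibility: {coverage_pct(118, 120):.1f}%")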

A minimal Python sketch (the evaluator name matches the section above; the exact evaluate() signature is an assumption and may differ by SDK version):

from fi.evals import DataPrivacyCompliance

# Run the privacy evaluator against a single output that deliberately
# leaks PII.
evaluator = DataPrivacyCompliance()
result = evaluator.evaluate(
    output="Customer SSN is 123-45-6789.",
)

# A failing result here contributes to privacy-TSC evidence.
print(result.score, result.reason)

The output, stored against a versioned dataset, is the audit artefact.

Common Mistakes

  • Treating trust-service prep as a pre-audit sprint. Evidence accumulates over months; collecting it in a panic produces incomplete artefacts.
  • Mapping evaluator outputs to TSCs informally. Document the mapping explicitly — auditors ask which evaluator covers which control (see the sketch after this list).
  • Logging guardrail decisions without retention discipline. Audits typically require 12 months of logs; verify retention.
  • Ignoring the AI-specific gaps in classic SOC 2. Add evidence for hallucination, fairness, and prompt-injection coverage even if the standard does not require it yet — buyers do.
  • Conflating SOC 2 with ISO 42001. They cover different controls; do not assume one implies the other.
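One way to make the evaluator-to-TSC mapping explicit, per the second mistake above. Only the DataPrivacyCompliance → privacy pairing appears earlier in this section; the others are illustrative and should be agreed with your auditor:

# Illustrative mapping; review every pairing with your auditor.
EVALUATOR_TO_TSC = {
    "DataPrivacyCompliance": ["privacy", "confidentiality"],
    "ContentSafety": ["security"],
    "Groundedness": ["processing integrity"],
    "BiasDetection": ["processing integrity"],
}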

Frequently Asked Questions

What is a trust service?

A trust service is a third-party assurance offering, most often an attestation against SOC 2's Trust Services Criteria, under which an auditor issues a report attesting that a system meets defined security, availability, and privacy controls.

How is a trust service different from compliance?

Compliance is the broad obligation to meet rules. A trust service is one specific, third-party-issued assurance product that evidences compliance with a defined set of criteria like SOC 2's five TSCs.

Does FutureAGI provide a trust service?

FutureAGI is not a trust service itself. It produces the evaluation evidence, audit logs, and dataset versioning that auditors use when issuing SOC 2 or ISO 42001 reports on AI systems.