Block hallucinations
before they reach your users

18+ guardrail types running pre- and post-processing. PII detection, prompt injection defense, hallucination checks, content moderation, secret detection, topic restriction, and custom rules. Enforce, monitor, or log - per rule. Sub-100ms inline latency. 8+ external integrations.

Guard Pipeline Active
4 rules Policy: Production Safety
User Input
"My email is alice@example.com and my SSN is 123-45-6789. What's your refund policy?"
PII Detection EMAIL, SSN detected REDACTED 12ms
Prompt Injection score: 0.08 PASS 8ms
Content Moderation score: 0.02 PASS 6ms
Secret Detection score: 0.00 PASS 4ms
Sanitized Input → LLM
"My email is [REDACTED] and my SSN is [REDACTED]. What's your refund policy?"
Total pipeline: 30ms 4/4 rules checked · 1 redaction · 0 blocked

Protect outperforms GPT-4.1
on every guardrail category

Based on Gemma 3n with LoRA fine-tuned adapters. Four specialized models for toxicity, sexism, data privacy, and prompt injection. Open-source text adapters on HuggingFace.

Prompt Injection +11.7% vs GPT-4.1
Protect 97.2%
WildGuard 88.5%
GPT-4.1 85.5%
LlamaGuard-4 83.0%
Toxicity 97.5% accuracy
Protect 97.5%
GPT-4.1 97.4%
WildGuard 94.0%
LlamaGuard-4 90.6%
Sexism +2.1% vs GPT-4.1
Protect 95.0%
GPT-4.1 92.9%
WildGuard 92.1%
LlamaGuard-4 63.4%
Data Privacy 85.7% accuracy
Protect 85.7%
GPT-4.1 85.1%
LlamaGuard-4 78.2%
WildGuard 73.3%
~67ms
Mean latency (text)
p50: 65ms · p95: 74ms
~109ms
Mean latency (image)
p50: 107ms · p95: 120ms
3
Modalities
Text · Image · Audio
Based on Gemma 3n (E4B) with LoRA fine-tuned adapters. Open-source text adapters on HuggingFace.
Read the paper
Core Features

Everything you need to
stop AI hallucinations

Real-time Guard
User Input
"What's your refund policy?"
Guard checking response...
Hallucination Detected
BLOCKED
"We offer a 60-day money back guarantee..."
Corrected Response
PASSED
"We offer a 30-day refund policy for all purchases..."
Latency: 23ms
Confidence: 98.2%

PII detection, prompt injection, content moderation, secret detection, hallucination checks, topic restriction, language detection, data leakage prevention, custom blocklists, system prompt protection, tool permissions, input validation, MCP security, custom expression rules, and webhook-based BYOG (bring your own guardrail). Each runs pre-processing (before the LLM) or post-processing (before the user) - or both.
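The pre/post split can be pictured as a two-stage pipeline. This is a minimal sketch, not the product's SDK - the `Rule` shape, rule names, and toy detectors here are all illustrative:

```python
import re
from dataclasses import dataclass
from typing import Callable

# Illustrative rule shape; the real product configures rules via dashboard/SDK.
@dataclass
class Rule:
    name: str
    check: Callable[[str], float]   # returns a violation score, 0.0 (safe) to 1.0
    stage: str                      # "pre" (before the LLM) or "post" (before the user)
    threshold: float = 0.5

def run_stage(rules: list[Rule], stage: str, text: str):
    """Run every rule registered for one stage; return (passed, flagged rule names)."""
    flagged = [r.name for r in rules
               if r.stage == stage and r.check(text) >= r.threshold]
    return (len(flagged) == 0, flagged)

# Toy checks standing in for real detectors.
rules = [
    Rule("pii", lambda t: 1.0 if re.search(r"\d{3}-\d{2}-\d{4}", t) else 0.0, "pre"),
    Rule("injection", lambda t: 0.9 if "ignore previous" in t.lower() else 0.0, "pre"),
    Rule("hallucination", lambda t: 0.8 if "60-day" in t else 0.0, "post"),
]

ok, hits = run_stage(rules, "pre", "My SSN is 123-45-6789")  # flagged by the PII rule
```

The same rule list serves both stages; only the `stage` field decides whether a check runs on the user's input or on the model's output.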

See all guard types

Each guardrail rule has its own enforcement mode. Enforce blocks the request (403). Monitor lets it through but logs a warning. Log records silently. Confidence scores from 0.0 (safe) to 1.0 (violation) with configurable thresholds - 0.3 for strict, 0.5 for balanced, 0.8 for obvious-only. Start in Monitor mode, graduate to Enforce when confident.
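One way to picture the three modes and the threshold knob - function and return values here are generic illustrations, not the vendor's API:

```python
def apply_mode(score: float, threshold: float, mode: str):
    """Map one rule's confidence score to an action under its enforcement mode."""
    if score < threshold:
        return ("allow", None)          # below threshold: no violation
    if mode == "enforce":
        return ("block", 403)           # request rejected with 403
    if mode == "monitor":
        return ("allow", "warning")     # passes through, logs a warning
    return ("allow", "logged")          # "log": recorded silently

# A borderline score of 0.4 is caught by the strict threshold (0.3)
# but sails past the balanced one (0.5).
strict = apply_mode(0.4, 0.3, "enforce")    # ("block", 403)
balanced = apply_mode(0.4, 0.5, "enforce")  # ("allow", None)
```

This is also why starting in Monitor is cheap: the scoring path is identical, only the final branch differs.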

Configure enforcement

When PII is detected (emails, SSNs, credit cards, phone numbers, addresses), choose how to handle it. Block rejects the request. Mask replaces with asterisks (alice@***.com). Redact removes entirely ([REDACTED]). Hash replaces with a consistent hash (#a1b2c3d4). Sanitize sensitive data while keeping the request flowing.
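The mask/redact/hash behaviors, shown for emails only, can be sketched with plain regex substitution - a simplified stand-in for the product's detectors, assuming nothing about its internals:

```python
import hashlib
import re

EMAIL = re.compile(r"([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")

def mask(text: str) -> str:
    # alice@example.com -> alice@***.com (local part kept, domain hidden)
    return EMAIL.sub(
        lambda m: f"{m.group(1)}@***.{m.group(2).rsplit('.', 1)[-1]}", text)

def redact(text: str) -> str:
    # Remove the value entirely.
    return EMAIL.sub("[REDACTED]", text)

def hash_pii(text: str) -> str:
    # Consistent hash: the same email always maps to the same token,
    # so downstream joins and deduplication still work.
    return EMAIL.sub(
        lambda m: "#" + hashlib.sha256(m.group(0).encode()).hexdigest()[:8], text)
```

Hashing is the interesting trade-off: unlike masking or redaction, it destroys the raw value but preserves equality, which keeps analytics usable on sanitized logs.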

Learn about PII handling

Plug in leading guardrail providers alongside built-in checks. Lakera Guard for PII and injection. Presidio for detection and redaction. Llama Guard for content moderation. AWS Bedrock Guardrails for multi-modal safety. Azure Content Safety, Pangea, Aporia, Enkrypt AI, HiddenLayer, DynamoAI, and more. Or bring your own via webhooks.

View integrations
How It Works

Go from zero to protected
in three steps

Guard Policy 4 rules
PII Detection enforce
Prompt Injection monitor
Content Moderation enforce
Secret Detection enforce

Create a guardrail policy

Stack rules together - PII detection, prompt injection, content moderation, secret detection. Set enforcement mode (enforce/monitor/log) and confidence thresholds per rule. Configure via dashboard or SDK.
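A policy like the card above could be expressed as plain data. The field names below are hypothetical - the real SDK's schema may differ - but the per-rule shape (type, mode, threshold) is the point:

```python
# Hypothetical policy definition mirroring the four-rule card;
# field names are illustrative, not the actual SDK schema.
policy = {
    "name": "Production Safety",
    "rules": [
        {"type": "pii_detection",      "mode": "enforce", "threshold": 0.5},
        {"type": "prompt_injection",   "mode": "monitor", "threshold": 0.3},
        {"type": "content_moderation", "mode": "enforce", "threshold": 0.5},
        {"type": "secret_detection",   "mode": "enforce", "threshold": 0.5},
    ],
}

# Only enforced rules can actually block traffic.
enforced = [r["type"] for r in policy["rules"] if r["mode"] == "enforce"]
```

Keeping mode and threshold per rule is what lets a team run injection detection in monitor with a strict 0.3 threshold while PII blocking stays enforced.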

Scope Active
✓ Gateway (all traffic)
○ Per project
○ Per API key
Pre + Post · Streaming supported

Apply to your traffic

Scope policies globally, per project, or per API key. All traffic through the gateway is automatically checked. Pre-processing guards run before the LLM; post-processing guards run before the user. Works with streaming.

Guard Monitor Live
Check passed 12ms
Check passed 8ms
PII Redacted 15ms

Monitor and tune

Track every check - what was blocked, what was flagged, what passed. See confidence scores, violation types, and trends. Submit feedback on false positives to improve detection. Graduate from Monitor to Enforce when confident.

Powering teams from
prototype to production

From ambitious startups to global enterprises, teams trust Future AGI to ship AI agents confidently.