What Is the Six-Month Moratorium?
The March 2023 Future of Life Institute open letter called for a six-month pause on training AI systems more powerful than GPT-4 and became a pivotal moment in the AI policy debate.
The six-month moratorium refers to the March 2023 open letter from the Future of Life Institute, signed by Yoshua Bengio, Stuart Russell, Elon Musk, Yuval Noah Harari, and hundreds of other researchers and technologists. The letter asked AI labs to pause, for at least six months, the training of any system more powerful than GPT-4, arguing that such systems should be built only once their effects can be managed and their risks contained. No major lab paused, but the letter became a defining moment in the modern AI policy debate.
Why It Matters in Production LLM and Agent Systems
The moratorium did not pause anything technical, but it shifted the policy and product landscape that production LLM teams now operate in. It accelerated the EU AI Act’s drafting cadence, contributed to the founding of the UK and US AI Safety Institutes, and made evaluation, red-teaming, and risk classification table-stakes for serious AI deployments. The shift in tone is what production teams feel: in 2026, “we don’t have an evaluation strategy” is no longer an acceptable answer to a board, regulator, or enterprise procurement team.
The pain of pretending the moratorium debate didn't matter shows up across roles. A compliance lead pushing into a Tier-1 enterprise deal is asked to "show us your AI risk-management framework" and has nothing structured to show. A platform team ships a model upgrade without a formal eval gate and gets escalated to security review after the first incident. A product team picks a model on benchmark scores alone and learns mid-rollout that the model lacks a usage-disclosure clause their region's emerging law now requires.
In 2026, three years after the moratorium letter, the policy environment it helped shape is the operational environment every LLM team works inside. Evaluation, red-teaming, audit logs, and alignment evidence are not optional features; they are the deliverables required to ship.
How FutureAGI Handles the Post-Moratorium Reliability Stack
FutureAGI does not take a position on whether the moratorium should have happened. We build the evaluation, observability, and guardrail tooling the post-moratorium policy environment requires. At the evaluation level, BiasDetection, ContentSafety, PromptInjection, Toxicity, and Faithfulness provide concrete metrics for the abstract safety properties the moratorium asked labs to verify. At the audit level, FutureAGI’s audit-log plus Dataset versioning preserve the exact inputs, outputs, and eval scores of every production decision so a regulator inquiry can be answered with reproducible artifacts months later. At the red-team level, simulate-sdk’s Persona and Scenario plus the agentharm-safety-benchmark-aligned attack library let teams probe their systems before adversaries do.
Concretely: an enterprise AI team building under EU AI Act high-risk-system requirements wires their production agent through the Agent Command Center with pre-guardrail: PromptInjection, post-guardrail: ContentSafety, and post-guardrail: BiasDetection. They run a nightly simulate-sdk regression against an attack library spanning prompt-injection, jailbreak, and bias categories, with results stored as a versioned Dataset. When their procurement team asks “what is your AI safety infrastructure,” they hand over the FutureAGI dashboard. The moratorium did not pass into law, but the operational standard it argued for did — and the platform that meets it is the platform that closes deals.
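A minimal sketch of what that wiring might look like in code. The dictionary shape and the register_guardrails helper below are illustrative placeholders, not the actual Agent Command Center API; consult the product documentation for the real calls:
# Illustrative guardrail configuration for the scenario above. Keys and the
# register_guardrails() stub are hypothetical assumptions, not a FutureAGI API.
guardrail_config = {
    "pre_guardrails": ["PromptInjection"],                   # screen inputs before the agent acts
    "post_guardrails": ["ContentSafety", "BiasDetection"],   # screen outputs before they ship
    "nightly_regression": {
        "attack_categories": ["prompt-injection", "jailbreak", "bias"],
        "results_dataset": "safety-regression-nightly",      # stored as a versioned Dataset
    },
}

def register_guardrails(agent_name: str, config: dict) -> None:
    # Stand-in for whatever call registers guardrails in your serving layer.
    ...

register_guardrails("eu-high-risk-agent", guardrail_config)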
How to Measure or Detect It
The moratorium itself isn’t measurable; the safety infrastructure it accelerated is. Wire these into evaluation:
- BiasDetection: outcome disparity across demographic cohorts.
- ContentSafety: content-policy violations per N requests.
- PromptInjection: detection rate against a versioned attack library.
- Toxicity: toxicity score distribution.
- Audit-log coverage: percentage of production decisions captured with full inputs, outputs, and eval scores.
- Red-team pass rate: proportion of simulate-sdk adversarial scenarios that pass guardrail thresholds (see the aggregation sketch after this list).
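A minimal sketch of how a nightly job might roll per-request eval records up into these metrics. The record fields and the in-memory list are illustrative assumptions, not a FutureAGI schema; adapt them to however your eval results are actually stored:
# Hypothetical per-request records produced by nightly evals; field names are illustrative.
results = [
    {"scenario": "prompt-injection-042", "adversarial": True, "detected": True, "passed": True, "logged": True},
    {"scenario": "benign-checkout-017", "adversarial": False, "detected": False, "passed": True, "logged": True},
    # ... one record per production request or simulated scenario
]

adversarial = [r for r in results if r["adversarial"]]
detection_rate = sum(r["detected"] for r in adversarial) / max(len(adversarial), 1)
red_team_pass_rate = sum(r["passed"] for r in adversarial) / max(len(adversarial), 1)
audit_log_coverage = sum(r["logged"] for r in results) / len(results)

print(f"prompt-injection detection rate: {detection_rate:.1%}")
print(f"red-team pass rate: {red_team_pass_rate:.1%}")
print(f"audit-log coverage: {audit_log_coverage:.1%}")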
Minimal Python — wire a content-safety regression eval:
from fi.evals import ContentSafety

# Project-specific exception for deploy-blocking eval failures; define or import your own.
class QualityGateFailure(Exception):
    pass

cs = ContentSafety()
result = cs.evaluate(
    response=model_output,  # the candidate response produced by the model under test
)
if result.score < 0.95:
    raise QualityGateFailure("content safety regression", result.reason)
Common Mistakes
- Treating safety policy as a marketing layer. Regulators, enterprise buyers, and incident postmortems all read the actual evals, not the press release.
- Running evals once, not on every deploy. The moratorium’s whole argument was that capability moves faster than safety; eval cadence has to match deploy cadence.
- Cherry-picking benchmark scores. A model that passes MMLU is not a model that passes your domain’s harm taxonomy. Build domain-specific evals.
- No audit log. "We checked" is not a defensible answer six months after the fact. Persist inputs, outputs, and eval scores (a minimal persistence sketch follows this list).
- Skipping red-team simulation. Without simulate-sdk-style adversarial probing, you find failure modes through customer incidents instead of regression evals.
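A minimal sketch of the kind of record worth persisting for every production decision, assuming a plain JSONL audit log. The schema and the write_audit_record helper are illustrative, not a specific FutureAGI API:
import json
import time
import uuid

def write_audit_record(prompt, output, eval_scores, path="audit_log.jsonl"):
    # Append one reproducible record per production decision (illustrative schema).
    record = {
        "id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "input": prompt,
        "output": output,
        "eval_scores": eval_scores,              # e.g. {"ContentSafety": 0.98, "BiasDetection": 0.91}
        "model_version": "my-model-2026-01-15",  # pin the exact deployed model version
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")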
Frequently Asked Questions
What is the six-month moratorium?
It was the March 2023 open letter from the Future of Life Institute, signed by hundreds of researchers and tech leaders, asking AI labs to pause, for at least six months, the training of any system more powerful than GPT-4 so that safety research and governance could catch up.
Did the moratorium actually happen?
No. No major lab paused training. But the letter shifted public debate, accelerated EU AI Act drafting, and influenced the formation of national AI Safety Institutes in the UK and US.
How does FutureAGI relate to the moratorium debate?
FutureAGI builds the evaluation, observability, and guardrail tooling the moratorium argued AI deployment requires. Our BiasDetection, ContentSafety, and PromptInjection evaluators are concrete forms of the safety infrastructure the letter called for.