What Is Invalid JSON Output (LLM)?
An LLM failure mode in which model output that was expected to be JSON cannot be parsed due to syntactic errors.
What Is Invalid JSON Output?
Invalid JSON is a structured-output failure mode where an LLM response that was supposed to be JSON cannot be parsed. The model wraps the output in a Markdown code fence, adds a “Sure, here’s your JSON:” preamble, forgets to escape an inner quote, or truncates mid-object. The result is unusable by any downstream parser. Because most LLM apps in 2026 pipe model output into business logic, an invalid-JSON response causes an immediate exception. Validity checking is therefore the first gate every structured-output pipeline needs before schema validation, business rules, or tool execution can run.
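Each of those failure shapes breaks a standard parser. A quick illustration with Python's built-in `json` module (the sample strings are representative, not real model transcripts):

```python
import json

fence = "`" * 3  # literal ``` written indirectly to keep this snippet self-contained
bad_outputs = [
    fence + 'json\n{"name": "Alice"}\n' + fence,   # Markdown code fence wrapper
    'Sure, here\'s your JSON: {"name": "Alice"}',  # prose preamble
    '{"quote": "she said "hi""}',                  # unescaped inner quote
    '{"items": [1, 2,',                            # truncated mid-object
]

for raw in bad_outputs:
    try:
        json.loads(raw)
    except json.JSONDecodeError as err:
        # Every one of these raises before any business logic can run.
        print(f"unparseable: {err.msg}")
```

All four strings raise `json.JSONDecodeError`, which is exactly the exception a downstream service hits when no validity gate sits in front of it.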
Why It Matters in Production LLM and Agent Systems
On 2026-04-29 a coding-agent feature at a developer-tools startup hard-failed at 11pm on a Friday. Postmortem: the planner had been prompted to return a JSON tool-call object. After a model provider rolled out a new fine-tune, the model started prefixing every response with three backticks and the word “json”. The downstream parser threw, the agent’s catch handler retried the same broken prompt, the retry budget burned out, and the user saw “agent unavailable.” Token spend tripled overnight from the retry storm. No JSON-validity eval was in the loop.
That is the invalid-JSON failure mode. It hits the application engineer (parser exceptions), the SRE (latency and cost spikes from retries), and the end user (silent feature failure). It is especially common at model-version boundaries — every provider quietly changes formatting habits between fine-tunes, and what used to be clean JSON becomes JSON wrapped in prose.
In agentic systems the failure compounds. Tool calls require structured arguments. A planner that emits invalid JSON for a tool’s parameters either crashes the executor or — worse — produces a partially-parsed argument set that runs the tool with garbage inputs. Function calling APIs (tools parameter in OpenAI, Anthropic, Google) reduce but do not eliminate the failure: structured outputs still misformat under adversarial inputs, long contexts, and edge cases. A 2026 production agent stack needs invalid-JSON detection between every model call and the next tool execution, full stop.
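A minimal sketch of that guard in plain Python. `execute_tool` is a hypothetical stand-in for a real executor; in a production stack the check would run at the gateway, not inline:

```python
import json

def execute_tool(name: str, args: dict) -> str:
    # Hypothetical tool executor stand-in for this sketch.
    return f"ran {name} with {sorted(args)}"

def guarded_tool_call(tool_name: str, raw_args: str) -> str:
    """Refuse to run a tool unless its arguments parse as a JSON object."""
    try:
        args = json.loads(raw_args)
    except json.JSONDecodeError as err:
        # Crash loudly here, not inside the tool with garbage inputs.
        raise ValueError(f"invalid JSON tool arguments: {err}") from err
    if not isinstance(args, dict):
        raise ValueError("tool arguments must be a JSON object")
    return execute_tool(tool_name, args)

print(guarded_tool_call("search", '{"query": "weather"}'))
```

The point of the `isinstance` check is the "worse" case from the paragraph above: a response like `[1, 2]` parses as valid JSON but is still not a usable argument set, so it must be rejected before execution.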
How FutureAGI Handles Invalid JSON
FutureAGI’s approach is two evaluators wired as a post-guardrail. fi.evals.IsJson (cloud template, Pass/Fail) is the canonical check — input is a string, output is whether the string parses as valid JSON. fi.evals.JSONSyntaxOnly (local metric) is the lightweight variant for high-throughput traces where you need pure syntax validation in microseconds. Both feed into the Agent Command Center as a post-guardrail policy: if the model returns invalid JSON, the gateway can trigger a retry policy with a stricter prompt or fall back to a more reliable model via the model-fallback primitive — without the application code ever seeing the broken response.
Concretely: a structured-extraction service is wrapped behind the Agent Command Center. The post-guardrail runs IsJson on every response. If the check fails, the gateway applies a configured retry policy (max 2 retries with a “JSON-only, no prose” prompt suffix) and on third failure routes to a fallback model. Every event is written to the trace with traceAI-openai. The dashboard plots is_json_pass_rate by model and prompt version. When a provider fine-tune drops the rate from 99.7% to 94.2%, the team sees it within minutes — not on Friday night during a customer outage.
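The shape of that policy, sketched in plain Python. The `call_model` and `call_fallback` callables and the prompt suffix are illustrative assumptions, not the FutureAGI gateway API:

```python
import json
from typing import Callable

JSON_ONLY_SUFFIX = "\nRespond with JSON only. No prose, no code fences."

def structured_call(
    call_model: Callable[[str], str],     # primary model client (assumed)
    call_fallback: Callable[[str], str],  # more reliable fallback model (assumed)
    prompt: str,
    max_retries: int = 2,
) -> dict:
    """Retry with a stricter prompt on invalid JSON, then route to a fallback model."""
    current = prompt
    for _ in range(max_retries + 1):
        raw = call_model(current)
        try:
            return json.loads(raw)  # the IsJson-style validity gate
        except json.JSONDecodeError:
            current = prompt + JSON_ONLY_SUFFIX  # tighten the prompt and retry
    # Retry budget exhausted: hand the request to the fallback model.
    return json.loads(call_fallback(prompt + JSON_ONLY_SUFFIX))
```

The application code above the gateway only ever receives a parsed `dict`; the broken responses, retries, and fallback routing stay invisible to it, which is the containment property the paragraph describes.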
Unlike Instructor’s runtime validation library which lives in application code, FutureAGI’s IsJson runs at the gateway layer, so the broken response is contained before the parser ever sees it.
How to Measure or Detect It
Signals to wire up:
- `fi.evals.IsJson` — Pass/Fail; canonical syntactic JSON check.
- `fi.evals.JSONSyntaxOnly` — local-metric, high-throughput variant.
- OTel attribute `llm.output.value` — the raw output string the evaluator scores.
- Dashboard signal: `is_json_pass_rate` by model + prompt version — drift indicator.
- Retry-rate metric — a high `retry-on-invalid-json` count is a leading indicator of a model regression.
````python
from fi.evals import IsJson

evaluator = IsJson()
result = evaluator.evaluate(
    text='```json\n{"name": "Alice", "age": 30}\n```'
)
print(result.score, result.reason)
````
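When the SDK is not in the loop, the same syntax-only gate can be approximated locally with the standard library. This is a stand-in for what `JSONSyntaxOnly` checks, not its implementation:

```python
import json
from dataclasses import dataclass

@dataclass
class SyntaxResult:
    score: float  # 1.0 = Pass, 0.0 = Fail
    reason: str

def json_syntax_only(text: str) -> SyntaxResult:
    """Microsecond-scale local check: does the string parse as JSON at all?"""
    try:
        json.loads(text)
        return SyntaxResult(1.0, "valid JSON")
    except json.JSONDecodeError as err:
        return SyntaxResult(0.0, f"line {err.lineno}, col {err.colno}: {err.msg}")

print(json_syntax_only('{"name": "Alice", "age": 30}').score)  # 1.0
print(json_syntax_only('{"name": "Alice"').score)              # 0.0 (truncated)
```

The `reason` string carries the parser's line and column, which is usually enough to tell a code-fence wrapper (failure at column 1) from a truncation (failure at end of input) in trace logs.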
Common Mistakes
- Confusing invalid JSON with schema-validation failure. Invalid JSON is unparseable; schema failure is parseable but structurally wrong. Run `IsJson` first, then `JsonSchema`.
- Trying to repair broken JSON in application code with regex. This works until it doesn’t; use a gateway-level retry with a stricter prompt instead.
- Skipping the eval when you use function-calling APIs. Even native structured outputs misformat under long contexts and adversarial inputs.
- Allowing prose preambles by accepting “extract the JSON between the braces.” That works on average but fails when nested objects appear; tighten the prompt and the eval.
- Logging the broken JSON without redaction. PII can leak into error logs unfiltered.
Frequently Asked Questions
What is invalid JSON output?
Invalid JSON is LLM output that was expected to be JSON but cannot be parsed because of syntactic errors — missing commas, unescaped quotes, code-fence wrappers, or extra prose.
How is invalid JSON different from schema-validation failure?
Invalid JSON cannot be parsed at all — it is syntactically broken. Schema-validation failure is parseable JSON that does not match the expected structure. Invalid JSON is checked first, then schema.
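The distinction in a few lines of Python (the structural check is sketched by hand with `isinstance` to stay dependency-free; a real pipeline would use a schema validator):

```python
import json

raw_invalid = '{"age": 30'            # unparseable: truncated, fails the validity gate
raw_schema_bad = '{"age": "thirty"}'  # parseable, but wrong type for a schema expecting int

try:
    json.loads(raw_invalid)
except json.JSONDecodeError:
    print("invalid JSON: rejected before any schema check can run")

doc = json.loads(raw_schema_bad)        # parses fine...
assert not isinstance(doc["age"], int)  # ...but fails structural validation
print("schema failure: parseable, structurally wrong")
```

This ordering is why the two checks are separate evaluators: a schema validator can only run on output that has already passed the syntax gate.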
How do you detect invalid JSON?
FutureAGI's `fi.evals.IsJson` evaluator returns Pass if the response is parseable JSON, Fail otherwise. Use `JSONSyntaxOnly` for the lightweight syntax-only variant on every structured-output trace.