What Is an Agent Loop?

The reason-act-observe cycle that an agent runtime executes per step until a goal or stop condition is met.

An agent loop is the reason-act-observe cycle that drives every step of an AI agent. The model reasons about the goal given the current state. The runtime executes the chosen action, usually a tool call, retrieval, or sub-agent invocation. The observed result is fed back into the model. The loop iterates until a termination condition fires: the goal is reached, a max-iteration cap is hit, or the agent emits an explicit stop. In a FutureAGI trace, one pass through the loop is a trio of spans: the LLM-reasoning span, the action span, and the observation update. As of May 2026, frontier reasoning models (GPT-5.x, Claude Opus 4.7, Gemini 3 Pro) make the loop richer per iteration but also more expensive when it runs away.
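The cycle described above can be sketched in a few lines of plain Python. This is a minimal illustration, not any framework's implementation: `run_agent_loop`, the `Action` tuple format, and the toy `toy_reason` and `toy_act` functions are all hypothetical stand-ins for a real model call and tool runtime.

```python
from typing import Callable, List, Tuple

# Hypothetical action format: ("tool_name", argument) or ("stop", final_answer).
Action = Tuple[str, str]


def run_agent_loop(
    goal: str,
    reason: Callable[[str, List[str]], Action],
    act: Callable[[Action], str],
    max_turns: int = 8,
) -> Tuple[str, List[str]]:
    """Run reason -> act -> observe until an explicit stop or the turn cap."""
    observations: List[str] = []
    for _ in range(max_turns):
        action = reason(goal, observations)  # reason about goal + current state
        if action[0] == "stop":              # explicit stop emitted by the agent
            return "explicit_stop", observations
        result = act(action)                 # act: execute the chosen tool call
        observations.append(result)          # observe: feed the result back
    return "max_turns", observations         # hard cap reached


# Toy stand-ins for the model and the tool runtime (assumptions for the demo).
def toy_reason(goal: str, observations: List[str]) -> Action:
    if len(observations) >= 2:
        return ("stop", "done")
    return ("search", goal)


def toy_act(action: Action) -> str:
    name, arg = action
    return f"{name} result for {arg!r}"


stop_reason, history = run_agent_loop("find docs", toy_reason, toy_act)
```

The function returns why the loop ended alongside the observations, which makes the termination condition something you can log and alert on rather than infer.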

Why It Matters in Production LLM and Agent Systems

The loop is where agents spend time and money. A free-form loop with no upper bound and no progress signal is the canonical 2026 incident: one user request produces 60 LLM calls and a $9 bill before timing out. A loop that exits too early (the model gives up after one tool failure) leaves users with half-finished work. Both look the same to the customer: “the agent didn’t help me.” Both look very different in a trace.

Roles feel it differently. The on-call engineer sees runaway-cost alerts when a loop misclassifies a tool error as the goal being unreachable, and bounces between retry and replan forever. The product lead sees abandoned-session rates climb when a loop stops short. The compliance reviewer sees actions taken at iteration four that no human approved, because the agent’s plan changed mid-loop.

In 2026 stacks, the loop runs inside agent frameworks (OpenAI Agents SDK’s Runner.run_sync, LangGraph’s graph.invoke, CrewAI’s Crew.kickoff), each with its own iteration semantics. The framework handles the mechanics, but the engineering responsibility for bounding the loop, measuring progress, and terminating cleanly stays with the team. If your loop has no progress eval and no hard cap, it has no production story. The public references that best surface unbounded-loop failures are τ-bench (Sierra’s multi-turn customer-support set, where frontier agents plateau in the mid-60% range as iteration counts grow) and SWE-Bench Verified (500 human-validated GitHub issues), where successful runs cluster tightly in turn count and failed runs are almost always the long-tail outliers.

How FutureAGI Handles Agent Loops

FutureAGI’s approach is to make every loop iteration visible and evaluable. The traceAI integrations (traceAI-openai-agents, traceAI-langgraph, traceAI-crewai, traceAI-langchain) emit one OpenTelemetry span per iteration, tagged with agent.trajectory.step and an iteration counter, so dashboards can show iterations-per-trace as a histogram. On the eval side, StepEfficiency scores how many iterations the agent wasted versus the minimum needed, and GoalProgress scores forward progress between iterations; a flat GoalProgress curve is the textbook signature of a stuck loop.

Concretely: a research agent built on the OpenAI Agents SDK with GPT-5.1 is showing creeping latency. A FutureAGI trace dashboard reveals that the iterations-per-trace p99 jumped from 4 to 14 after a tool spec was edited. StepEfficiency flags 9.4 wasted iterations on average; GoalProgress shows that progress flatlines after iteration 4, because the agent is calling the same broken search tool over and over. The team adds a per-tool failure cap, sets a hard max_turns=8, and ships a regression eval that fails any trace exceeding 6 iterations. Cost drops 60%. Without per-iteration spans and per-iteration evaluators, this would have shown up as “agent is slow” with nowhere to look. LangSmith’s trace view shows the iteration count; FutureAGI additionally scores each iteration’s progress.
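The per-tool failure cap mentioned above could look something like the following wrapper. It is a sketch under assumptions: `FailureCappedTools`, `ToolFailureCapExceeded`, and the deliberately broken `broken_search` tool are hypothetical names invented for illustration and are not part of the OpenAI Agents SDK or FutureAGI.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class ToolFailureCapExceeded(RuntimeError):
    """Raised when a tool has already failed too many times in this trace."""


class FailureCappedTools:
    """Route tool calls and refuse a tool once it has failed `max_failures` times."""

    def __init__(self, tools: Dict[str, Callable[[str], str]], max_failures: int = 3):
        self.tools = tools
        self.max_failures = max_failures
        self.failures: Dict[str, int] = defaultdict(int)

    def call(self, name: str, arg: str) -> str:
        if self.failures[name] >= self.max_failures:
            # Stop before spending another call on a tool that keeps failing.
            raise ToolFailureCapExceeded(f"{name} failed {self.failures[name]} times")
        try:
            return self.tools[name](arg)
        except Exception:
            self.failures[name] += 1  # count the failure, then let the loop see it
            raise


# Demo: a search tool that always fails, as after a bad tool-spec edit.
calls = {"search": 0}


def broken_search(query: str) -> str:
    calls["search"] += 1
    raise ValueError("bad tool spec")


tools = FailureCappedTools({"search": broken_search}, max_failures=3)
outcomes: List[str] = []
for _ in range(5):
    try:
        tools.call("search", "agent loops")
    except ToolFailureCapExceeded:
        outcomes.append("capped")
    except ValueError:
        outcomes.append("failed")
```

After three real failures the wrapper short-circuits, so the remaining attempts never reach the broken tool. The loop can treat the cap exception as a signal to replan or escalate.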

Loop termination conditions

A safe loop has more than one termination condition. The table below is the FutureAGI default; every production agent should ship with at least the first three.

| Condition | Default value | What it prevents | Where to enforce |
|---|---|---|---|
| max_turns cap | 8-12 for support, 20-30 for research | Runaway cost | Framework Runner config |
| Per-tool failure cap | 3 retries per unique tool | Same broken tool spamming | Tool wrapper |
| Goal-progress timeout | Flatline GoalProgress for 3 steps | Stuck reasoning | FutureAGI evaluator alert |
| User session timeout | 5 minutes idle | Abandoned conversations | Session layer |
| Cost cap per trace | $0.50 typical | Cost regressions | Gateway / Agent Command Center |
| Explicit stop tool | finish or escalate_to_human | Endless polite refusals | System prompt |
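Several of these conditions can be checked together at the top of each iteration. The sketch below is one possible way to combine them, assuming progress scores are floats where higher is better. `LoopLimits` and `stop_reason` are hypothetical names, and the defaults simply mirror the table's values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LoopLimits:
    max_turns: int = 8          # hard iteration cap
    flatline_steps: int = 3     # steps without progress before giving up
    cost_cap_usd: float = 0.50  # per-trace spend ceiling


def stop_reason(
    turn: int,
    progress_scores: List[float],
    cost_usd: float,
    limits: LoopLimits,
) -> Optional[str]:
    """Return the first termination condition that fires, or None to continue."""
    if turn >= limits.max_turns:
        return "max_turns"
    if cost_usd >= limits.cost_cap_usd:
        return "cost_cap"
    n = limits.flatline_steps
    # Flatline: the latest score is no better than the score n steps earlier.
    if len(progress_scores) > n and progress_scores[-1] <= progress_scores[-(n + 1)]:
        return "progress_flatline"
    return None


limits = LoopLimits()
healthy = stop_reason(turn=4, progress_scores=[0.1, 0.2, 0.3, 0.4], cost_usd=0.20, limits=limits)
stuck = stop_reason(turn=5, progress_scores=[0.2, 0.4, 0.4, 0.4, 0.4], cost_usd=0.25, limits=limits)
```

Returning a named reason rather than a boolean keeps the termination cause available for logging, so a dashboard can separate cap hits from flatlines.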

How to Measure or Detect It

Loops are best measured at the iteration level, because total counts hide where progress stalls:

  • StepEfficiency: returns a score reflecting wasted iterations across the trajectory.
  • GoalProgress: returns per-step progress; a flat curve indicates a stuck loop.
  • TrajectoryScore: aggregates per-step quality across the entire loop.
  • iterations-per-trace (dashboard signal): the histogram of loop counts; p99 spikes are early warning.
  • agent.trajectory.step (OTel attribute): per-iteration span attribute; combined with iteration index gives you a per-step view.
  • infinite-loop alerts: hard alert when iteration count exceeds threshold; pair with agent-loop-detection.
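The dashboard signals in this list can be approximated offline from exported span records. The sketch below assumes each span is a plain dict carrying a `trace_id` and the `agent.trajectory.step` attribute. That record shape, the helper names, and the nearest-rank percentile method are assumptions made for illustration, not the traceAI export format.

```python
import math
from collections import Counter
from typing import Dict, List


def iterations_per_trace(spans: List[dict]) -> Dict[str, int]:
    """Count per-iteration spans for each trace."""
    return dict(Counter(span["trace_id"] for span in spans))


def percentile(values: List[int], pct: float) -> int:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def traces_over_threshold(counts: Dict[str, int], threshold: int) -> List[str]:
    """Trace ids whose iteration count exceeds the alert threshold."""
    return sorted(tid for tid, n in counts.items() if n > threshold)


# Synthetic spans: three traces with 4, 14, and 3 iterations.
spans = (
    [{"trace_id": "a", "agent.trajectory.step": i} for i in range(4)]
    + [{"trace_id": "b", "agent.trajectory.step": i} for i in range(14)]
    + [{"trace_id": "c", "agent.trajectory.step": i} for i in range(3)]
)

counts = iterations_per_trace(spans)
histogram = Counter(counts.values())          # iteration count -> number of traces
p99 = percentile(list(counts.values()), 99)
alerts = traces_over_threshold(counts, threshold=6)
```

With only three traces the p99 is simply the largest count; on production volumes the same function highlights the long tail that averages hide.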

Minimal Python:

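from fi.evals import StepEfficiency, GoalProgress

# Placeholder inputs: replace with the real user goal and the spans
# collected for one agent trace (for example, exported via traceAI).
user_goal = "Summarize recent papers on agent evaluation"
spans = []  # per-iteration spans for the trace under evaluation

step_eff = StepEfficiency()  # scores wasted iterations across the trajectory
progress = GoalProgress()    # scores per-step forward progress

result = step_eff.evaluate(
    input=user_goal,
    trajectory=spans,
)
print(result.score, result.reason)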

Common Mistakes

  • No max-iteration cap. A loop without a hard ceiling is one bug away from a runaway-cost incident; set max_turns or equivalent on every agent.
  • Treating each iteration as cheap. Ten LLM calls at $0.05 each is a $0.50 request; agents make this normal unless you measure.
  • Skipping GoalProgress. Without a progress signal, you cannot distinguish productive iterations from spinning ones.
  • Stopping on first tool failure. Real tools fail transiently; bake retry-with-jitter into the loop, but cap retries.
  • Confusing loop iteration with reasoning step. ReAct’s reason-act trace is one iteration; one iteration may produce many tool sub-calls. Be precise about which level you measure.
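The retry advice in the list above can be sketched as capped exponential backoff with full jitter. Everything named here is hypothetical: `TransientToolError` stands in for whatever retryable exception your tools raise, and the `sleep` parameter exists only so the delay can be stubbed out in tests.

```python
import random
import time
from typing import Callable, List


class TransientToolError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx, etc.)."""


def call_with_retry(
    tool: Callable[[str], str],
    arg: str,
    max_retries: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `tool`, retrying transient failures up to `max_retries` times."""
    for attempt in range(max_retries + 1):
        try:
            return tool(arg)
        except TransientToolError:
            if attempt == max_retries:
                raise  # cap reached: surface the failure to the loop
            # Full jitter: random delay up to an exponentially growing bound.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise AssertionError("unreachable")


# Demo: a tool that fails twice, then succeeds.
attempts = {"n": 0}


def flaky_tool(arg: str) -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientToolError("timeout")
    return f"ok:{arg}"


delays: List[float] = []
result = call_with_retry(flaky_tool, "q", sleep=delays.append)
```

Only the designated transient error is retried; any other exception propagates immediately, so the loop does not spend retries on failures that cannot succeed.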

Frequently Asked Questions

What is an agent loop?

An agent loop is the reason-act-observe cycle that runs every agent step: the model reasons, the runtime acts on a tool, the result is observed and fed back, and the loop iterates until done.

How is an agent loop different from a workflow?

A loop is open-ended: the model decides the next step at every iteration. A workflow is structured: legal next steps are declared up front. Loops are flexible; workflows are predictable.
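One way to picture the difference: a workflow can be written down as a table of legal transitions and checked before anything runs, whereas a loop only reveals its path at runtime. The step names and the `is_legal_path` helper below are hypothetical, chosen for illustration.

```python
from typing import Dict, List, Set

# A workflow declares its legal next steps up front.
WORKFLOW: Dict[str, Set[str]] = {
    "classify": {"lookup_order", "escalate"},
    "lookup_order": {"refund", "escalate"},
    "refund": set(),
    "escalate": set(),
}


def is_legal_path(path: List[str], transitions: Dict[str, Set[str]]) -> bool:
    """Check that every consecutive pair of steps is a declared transition."""
    return all(
        nxt in transitions.get(cur, set())
        for cur, nxt in zip(path, path[1:])
    )


declared = is_legal_path(["classify", "lookup_order", "refund"], WORKFLOW)
undeclared = is_legal_path(["classify", "refund"], WORKFLOW)
```

An agent loop has no such table: the model may pick any available action at each iteration, which is why it needs the caps and progress signals a workflow gets from its structure.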

How do you measure an agent loop?

FutureAGI's StepEfficiency scores wasted iterations; GoalProgress scores forward progress per step. Pair both with a max-iteration cap to prevent runaway loops.