Agents

What Is an Agent Loop?

The reason-act-observe cycle that an agent runtime executes per step until a goal or stop condition is met.

An agent loop is the reason-act-observe cycle that drives every step of an AI agent. The model reasons about the goal given the current state. The runtime executes the chosen action — usually a tool call, retrieval, or sub-agent invocation. The observed result is fed back into the model. The loop iterates until a termination condition fires: the goal is reached, a max-iteration cap is hit, or the agent emits an explicit stop. In a FutureAGI trace, one pass through the loop is a trio of spans: the LLM-reasoning span, the action span, and the observation update.
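The cycle can be sketched in a few lines of plain Python. Here `model_step` and `run_tool` are hypothetical stand-ins for the LLM call and the tool runtime, not any framework's API:

```python
# Minimal sketch of a reason-act-observe loop (illustrative, not an SDK).
def model_step(goal, observations):
    # Reason: decide the next action from the goal and observations so far.
    if observations and observations[-1] == "found answer":
        return ("stop", None)
    return ("search", goal)

def run_tool(action, arg):
    # Act: execute the chosen tool and return an observation.
    return "found answer"

def agent_loop(goal, max_turns=8):
    observations = []
    for turn in range(max_turns):          # hard iteration cap
        action, arg = model_step(goal, observations)
        if action == "stop":               # explicit stop condition
            return observations, turn
        result = run_tool(action, arg)
        observations.append(result)        # observe: feed result back
    return observations, max_turns         # cap hit: terminate anyway
```

The `max_turns` ceiling and the explicit stop are the two termination conditions every production loop needs.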

Why It Matters in Production LLM and Agent Systems

The loop is where agents spend time and money. A free-form loop with no upper bound and no progress signal is the canonical 2026 incident: one user request produces 60 LLM calls and a $9 bill before timing out. A loop that exits too early — model gives up after one tool failure — leaves users with half-finished work. Both look the same to the customer: “the agent didn’t help me.” Both look very different in a trace.

Different roles feel the failure differently. The on-call engineer sees runaway-cost alerts when a loop misclassifies a tool error as the goal being unreachable and bounces between retry and replan forever. The product lead sees abandoned-session rates climb when a loop stops short. The compliance reviewer sees actions taken at iteration four that no human approved, because the agent’s plan changed mid-loop.

In 2026 stacks, the loop runs inside frameworks — OpenAI Agents SDK’s Runner.run_sync, LangGraph’s graph.invoke, CrewAI’s Crew.kickoff — each with its own iteration semantics. The framework handles the mechanics, but the engineering responsibility for bounding the loop, measuring progress, and terminating cleanly stays with the team. If your loop has no progress eval and no hard cap, it has no production story.

How FutureAGI Handles Agent Loops

FutureAGI’s approach is to make every loop iteration visible and evaluable. The traceAI integrations — traceAI-openai-agents, traceAI-langgraph, traceAI-crewai, traceAI-langchain — emit one OpenTelemetry span per iteration, tagged with agent.trajectory.step and an iteration counter, so dashboards can show iterations-per-trace as a histogram. On the eval side, StepEfficiency scores how many iterations the agent wasted versus the minimum needed, and GoalProgress scores forward progress between iterations — a flat GoalProgress curve is the textbook signature of a stuck loop.
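To make the iterations-per-trace signal concrete, here is one way the count could be derived from iteration-tagged spans. The span records and attribute names below are illustrative, not the actual traceAI schema:

```python
from collections import Counter

# Hypothetical exported span records: each carries its trace id and a
# per-iteration index attribute (names are illustrative).
spans = [
    {"trace_id": "t1", "agent.trajectory.step": i} for i in range(4)
] + [
    {"trace_id": "t2", "agent.trajectory.step": i} for i in range(14)
]

# iterations-per-trace: highest iteration index seen per trace, plus one.
per_trace = Counter()
for s in spans:
    per_trace[s["trace_id"]] = max(per_trace[s["trace_id"]],
                                   s["agent.trajectory.step"] + 1)

print(dict(per_trace))  # → {'t1': 4, 't2': 14}
```

Feeding these counts into a histogram is what surfaces a p99 jump like the one in the scenario below.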

Concretely: a research agent built on the OpenAI Agents SDK is showing creeping latency. A FutureAGI trace dashboard reveals the iterations-per-trace p99 jumped from 4 to 14 after a tool spec was edited. StepEfficiency flags 9.4 wasted iterations on average; GoalProgress shows progress flatlines after iteration 4 — the agent is calling the same broken search tool over and over. The team adds a per-tool failure cap, sets a hard max_turns=8, and ships a regression eval that fails any trace exceeding 6 iterations. Cost drops 60%. Without per-iteration spans and per-iteration evaluators, this would have shown up as “agent is slow” with nowhere to look.
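The two guards the team shipped can be sketched as a stand-alone regression check; the names and thresholds here are illustrative, not a FutureAGI API:

```python
# Illustrative thresholds from the scenario above.
MAX_TOOL_FAILURES = 2
REGRESSION_ITER_LIMIT = 6

def check_trace(iterations, tool_failures):
    """Return the reasons a trace should fail the regression eval.

    iterations    -- total loop iterations observed in the trace
    tool_failures -- mapping of tool name to failure count in the trace
    """
    problems = []
    if iterations > REGRESSION_ITER_LIMIT:
        problems.append(
            f"too many iterations: {iterations} > {REGRESSION_ITER_LIMIT}")
    for tool, failures in tool_failures.items():
        if failures > MAX_TOOL_FAILURES:
            problems.append(f"tool '{tool}' failed {failures} times")
    return problems
```

Run over exported traces in CI, an empty list means the trace passes; any entry fails the build.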

How to Measure or Detect It

Loops are best measured at the iteration level — total counts hide where progress stalls:

  • StepEfficiency: returns a score reflecting wasted iterations across the trajectory.
  • GoalProgress: returns per-step progress; a flat curve indicates a stuck loop.
  • TrajectoryScore: aggregates per-step quality across the entire loop.
  • iterations-per-trace (dashboard signal): the histogram of loop counts; p99 spikes are early warning.
  • agent.trajectory.step (OTel attribute): per-iteration span attribute; combined with iteration index gives you a per-step view.
  • infinite-loop alerts: hard alert when iteration count exceeds threshold; pair with agent-loop-detection.

Minimal Python:

from fi.evals import StepEfficiency, GoalProgress

step_eff = StepEfficiency()
progress = GoalProgress()

# user_goal and spans come from your application and trace export.
result = step_eff.evaluate(
    input=user_goal,
    trajectory=spans,
)
print(result.score, result.reason)

# Same trajectory through GoalProgress; a flat per-step curve means a stuck loop.
progress_result = progress.evaluate(
    input=user_goal,
    trajectory=spans,
)
print(progress_result.score, progress_result.reason)
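For the flat-curve signal specifically, a minimal stand-alone detector over a list of per-step GoalProgress scores might look like this (a generic sketch, not part of fi.evals):

```python
# Flag a stuck loop from a per-step progress curve: if the last `window`
# steps gained less than `epsilon` progress, the agent is spinning in place.
def is_stuck(progress_curve, window=3, epsilon=0.01):
    if len(progress_curve) < window + 1:
        return False
    tail = progress_curve[-(window + 1):]
    return (tail[-1] - tail[0]) < epsilon
```

Wired into the loop itself, this check lets the agent replan or stop early instead of burning its remaining iteration budget.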

Common Mistakes

  • No max-iteration cap. A loop without a hard ceiling is one bug away from a runaway-cost incident; set max_turns or equivalent on every agent.
  • Treating each iteration as cheap. Ten LLM calls at $0.05 each is a $0.50 request — agents make this normal unless you measure.
  • Skipping GoalProgress. Without a progress signal, you cannot distinguish productive iterations from spinning ones.
  • Stopping on first tool failure. Real tools fail transiently; bake retry-with-jitter into the loop, but cap retries.
  • Confusing loop iteration with reasoning step. ReAct’s reason-act trace is one iteration; one iteration may produce many tool sub-calls. Be precise about which level you measure.
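The retry advice in the fourth bullet is the standard capped retry-with-jitter pattern; a generic sketch, not a FutureAGI helper:

```python
import random
import time

def call_with_retries(tool, *args, max_retries=3, base_delay=0.1):
    """Call a flaky tool, retrying transient failures with jittered backoff."""
    for attempt in range(max_retries + 1):
        try:
            return tool(*args)
        except Exception:
            if attempt == max_retries:
                raise  # cap hit: surface the failure instead of looping forever
            # Full jitter: sleep a random fraction of an exponential backoff.
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))
```

The cap matters as much as the jitter: without it, a permanently broken tool turns each loop iteration into an unbounded retry storm.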

Frequently Asked Questions

What is an agent loop?

An agent loop is the reason-act-observe cycle that runs every agent step: the model reasons, the runtime acts on a tool, the result is observed and fed back, and the loop iterates until done.

How is an agent loop different from a workflow?

A loop is open-ended: the model decides the next step at every iteration. A workflow is structured: legal next steps are declared up front. Loops are flexible; workflows are predictable.

How do you measure an agent loop?

FutureAGI's StepEfficiency scores wasted iterations; GoalProgress scores forward progress per step. Pair both with a max-iteration cap to prevent runaway loops.