What Is Agent Occupancy?

A contact-center metric for the percentage of logged-in time a rep spends actively handling contacts; the AI equivalent is concurrent-session utilization.

Agent occupancy is a contact-center workforce-management metric: the percentage of an agent’s logged-in time spent actively handling contacts (talk time plus after-call work) versus available-but-idle time. The standard formula is (handle time / logged-in time) × 100. High occupancy (>90%) signals saturation and burnout risk; low occupancy (<60%) signals over-provisioned staffing. The AI-agent equivalent is concurrent-session utilization across a voice-agent fleet, which FutureAGI scores differently than human-rep occupancy because the failure modes — burnout, attrition, schedule adherence — do not apply to LLM-based agents.
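The standard formula above reduces to a one-liner; a minimal sketch with illustrative shift numbers (not from a real WFM export):

```python
def agent_occupancy(handle_time_min: float, logged_in_min: float) -> float:
    """Occupancy = (talk time + after-call work) / logged-in time, as a percent."""
    if logged_in_min <= 0:
        raise ValueError("logged-in time must be positive")
    return (handle_time_min / logged_in_min) * 100

# 410 minutes of handle time over an 8-hour logged-in shift:
occ = agent_occupancy(handle_time_min=410, logged_in_min=480)
print(f"{occ:.1f}%")  # 410 / 480 ≈ 85.4%, inside the healthy band
```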

Why agent occupancy matters in production LLM and agent systems

For human contact centers, occupancy is the central WFM lever — it links staffing cost to customer wait time and to attrition. The trade-off is well-studied: every percentage point above 90% raises attrition risk, but every point below 70% inflates the cost-per-contact. Workforce-management platforms exist to keep occupancy in a target band.
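The target-band logic above can be sketched as a simple check; the 70–90% band is taken from the trade-offs just described, and the thresholds are illustrative defaults, not a WFM-platform API:

```python
def occupancy_status(occupancy_pct: float, low: float = 70.0, high: float = 90.0) -> str:
    """Classify an occupancy reading against an illustrative 70-90% target band."""
    if occupancy_pct > high:
        return "saturated: burnout and attrition risk"
    if occupancy_pct < low:
        return "over-provisioned: cost-per-contact inflated"
    return "in band"

print(occupancy_status(85.4))  # in band
print(occupancy_status(95.0))  # saturated: burnout and attrition risk
```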

For AI-agent fleets the equivalent question is different but related: are you over-provisioning compute, or are voice agents queueing requests because the fleet is saturated? An LLM voice agent does not burn out, but it does experience cold-start latency when scaled from idle, increased p99 latency under contention, and degraded ASR accuracy when audio infra is overloaded. SREs running voice fleets see those symptoms during traffic spikes — and they correspond, loosely, to “high occupancy” on the AI side.
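Concurrent-session utilization, the AI-side analogue named above, reduces to a capacity ratio; a minimal sketch with invented fleet numbers:

```python
def fleet_utilization(active_sessions: int, replicas: int, sessions_per_replica: int) -> float:
    """Share of the fleet's total session capacity currently in use, as a percent."""
    capacity = replicas * sessions_per_replica
    return (active_sessions / capacity) * 100

# 340 live sessions on 10 replicas, each sized for 40 concurrent sessions:
util = fleet_utilization(active_sessions=340, replicas=10, sessions_per_replica=40)
print(f"{util:.0f}% of capacity")  # 340 / 400 = 85%
```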

A platform engineer cares about session concurrency per replica, time-to-first-audio under load, and queue depth. A product reviewer cares about whether the agent’s voice quality degrades when the fleet is saturated — pauses, repeated greetings, audio glitches. A finance lead cares about the cost-per-handled-call ratio. None of these match the human-WFM occupancy formula precisely, which is why FutureAGI measures the AI side with different evaluators rather than re-using the WFM term.

How FutureAGI handles agent occupancy

FutureAGI does not measure agent occupancy directly — that is a workforce-management tool’s job (NICE, Genesys, Talkdesk all expose occupancy in their WFM modules, and they are the right surfaces for human-rep scheduling). The closest related capability inside FutureAGI is voice-AI agent evaluation. The traceAI livekit and pipecat integrations capture every voice-agent session as an OpenTelemetry span tagged with session ID, agent name, and agent.trajectory.step. Evaluators like ASRAccuracy, AudioQualityEvaluator, and ConversationResolution score whether the fleet is actually handling load well, regardless of headcount math.
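For illustration, a per-session span might carry attributes shaped like the following; apart from agent.trajectory.step, which is named above, the key names are assumptions rather than the exact traceAI schema:

```python
# Illustrative shape of the attributes a captured voice-agent session span carries.
session_span_attributes = {
    "session.id": "abc-123",        # assumed key name for the session ID
    "agent.name": "support-voice",  # assumed key name for the agent name
    "agent.trajectory.step": 3,     # step index within the agent trajectory
}
```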

FutureAGI’s approach is to treat occupancy as a correlation signal for AI fleet contention, not as the primary reliability metric. Unlike Erlang C staffing models or NICE occupancy dashboards, the AI workflow asks whether sessions stayed intelligible, timely, and resolved under concurrent load.

Concrete example: a voice-AI fleet running on LiveKit shows time-to-first-audio degrading from 380ms p99 at 100 concurrent sessions to 1.2s p99 at 400. FutureAGI’s AudioQualityEvaluator flags a 14% jump in glitch-rate. That is not “high occupancy” in the WFM sense — it is fleet contention. The fix is autoscaling policy on the inference-engine layer, not adding human reps. The simulate SDK’s LiveKitEngine lets the team replay the same load profile pre-deploy and verify the autoscaler holds quality at target concurrency.
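The p99 degradation in this example can be detected directly from raw time-to-first-audio samples; a minimal sketch using synthetic latencies, not real session data:

```python
from statistics import quantiles

def p99(samples_ms: list[float]) -> float:
    """99th-percentile latency from a list of per-session samples."""
    return quantiles(samples_ms, n=100)[98]

# Synthetic latency profiles: light load vs. saturated fleet.
baseline = [300 + i for i in range(100)]      # ~100 concurrent sessions
loaded   = [900 + 3 * i for i in range(100)]  # ~400 concurrent sessions

print(f"baseline p99: {p99(baseline):.0f}ms, loaded p99: {p99(loaded):.0f}ms")
```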

For the measurable latency side of this concept, see time to first audio. For human-WFM occupancy specifically, integrate your WFM platform’s exports into a FutureAGI dataset only if you want to correlate human-handled volume with AI-deflection rate.

How to measure or detect agent occupancy

For human contact centers, occupancy is computed by your WFM platform; FutureAGI does not replicate it. For AI voice fleets, the analogous signals are:

  • time-to-first-audio p99 (dashboard signal): canonical voice-agent latency metric; rises under fleet contention.
  • AudioQualityEvaluator: scores per-session audio for glitches, dropouts, and codec artifacts that spike under load.
  • ASRAccuracy: word-error-rate against reference; degrades when the audio pipeline is saturated.
  • ConversationResolution: did the call actually close successfully — the AI-side equivalent of “handle time well spent.”
  • concurrent-sessions per replica (dashboard signal): the AI-fleet equivalent of occupancy; the right autoscaling target.
  • cost-per-handled-call: financial KPI that pairs with concurrent-sessions for capacity planning.
A minimal scoring pass over a single captured session:

from fi.evals import AudioQualityEvaluator, ASRAccuracy

# Reference transcript for the session (illustrative placeholder).
ground_truth = "hello, thanks for calling, how can I help?"

# Per-session audio scoring: glitches, dropouts, codec artifacts.
audio = AudioQualityEvaluator().evaluate(
    audio_path="/sessions/abc.wav",
)
# Word-error-rate of the transcription against the reference.
asr = ASRAccuracy().evaluate(
    audio_path="/sessions/abc.wav",
    reference_text=ground_truth,
)
print(audio.score, asr.score)
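The last two signals in the list, cost-per-handled-call and concurrent sessions per replica, pair naturally for capacity planning; a minimal sketch with invented cost and fleet figures:

```python
def cost_per_handled_call(fleet_cost_usd: float, handled_calls: int) -> float:
    """Financial KPI: total fleet spend divided by calls actually handled."""
    return fleet_cost_usd / handled_calls

def sessions_per_replica(active_sessions: int, replicas: int) -> float:
    """The AI-fleet analogue of occupancy; a sensible autoscaling target."""
    return active_sessions / replicas

print(cost_per_handled_call(1200.0, 9600))  # 0.125 -> $0.125 per call
print(sessions_per_replica(340, 10))        # 34.0 sessions per replica
```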

Common mistakes

  • Applying human-WFM occupancy formulas to AI agents. LLM agents do not have break time or schedule adherence; the formula does not transfer.
  • Targeting 100% occupancy. Even on AI fleets, sustained 100% session utilization triggers cold-start spikes when traffic shifts; keep headroom.
  • Treating low occupancy as wasted spend. Some idle capacity is the buffer that holds p99 latency at SLA; price it as insurance, not waste.
  • Conflating occupancy with utilization. Utilization includes break time in the denominator; occupancy does not. Mixing them produces wrong staffing models.
  • Hand-coding occupancy when your WFM platform already exports it. Use the WFM API; only re-derive it for the AI-fleet equivalent.

Frequently Asked Questions

What is agent occupancy?

Agent occupancy is a contact-center workforce-management metric: the share of an agent's logged-in time spent actively handling contacts, calculated as handle time divided by logged-in time and expressed as a percentage.

How is agent occupancy different from agent utilization?

Occupancy excludes break and unavailable time — it is the share of available time spent on contacts. Utilization is the share of total scheduled time spent on contacts. Occupancy is always >= utilization.
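The distinction works out numerically like this, for an 8-hour shift with a 1-hour break (figures are illustrative):

```python
scheduled_min = 480  # total scheduled shift
break_min     = 60   # breaks and unavailable time
handle_min    = 340  # talk time + after-call work

occupancy   = handle_min / (scheduled_min - break_min) * 100  # excludes breaks
utilization = handle_min / scheduled_min * 100                # includes breaks

print(f"occupancy {occupancy:.1f}%, utilization {utilization:.1f}%")
# -> occupancy 81.0%, utilization 70.8%; occupancy >= utilization, as above
```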

Does FutureAGI measure agent occupancy?

FutureAGI does not measure human-rep occupancy directly — that is a workforce-management tool's job. It does evaluate voice-AI agents, which is the closest AI equivalent: ASRAccuracy, AudioQualityEvaluator, and ConversationResolution score the voice fleet handling load.