What Is a Shapley Value?

Shapley values, introduced by Lloyd Shapley in 1953, are a cooperative game-theory concept that fairly distributes a total payout among players according to each player’s average marginal contribution across every possible coalition. The allocation satisfies four axioms — efficiency (the values sum to the total), symmetry (interchangeable players get equal value), dummy (a player who contributes nothing gets zero), and additivity (composability across games). In machine learning, Shapley values are the foundation of SHAP-based explanation: features are the players, the prediction is the payout, and each feature gets a contribution score that respects all four axioms.
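
To make the definition concrete, here is a minimal brute-force sketch of the exact computation; the function name and toy game are illustrative, and full enumeration is only feasible for a handful of players:

from itertools import combinations
from math import factorial

def exact_shapley(players, value_fn):
    # Average each player's marginal contribution over every coalition of
    # the remaining players, weighted by the standard Shapley coefficient.
    n = len(players)
    values = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):
            for coalition in combinations(others, size):
                s = set(coalition)
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        values[i] = total
    return values

# Toy game: the payout is the coalition size, so every player gets exactly 1.0.
print(exact_shapley(["a", "b", "c"], lambda s: len(s)))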

Why It Matters in Production LLM and Agent Systems

Shapley values matter because they are the only attribution method with an axiomatic fairness guarantee. Other ML-explanation techniques approximate or relax these properties; Shapley values do not. That mathematical property is what makes them defensible in regulated settings, where “fair attribution” is a question lawyers ask, not just engineers.

The pain of choosing weaker explanation methods shows up across roles. A compliance lead defends a credit-decision model under the EU AI Act and is asked, “is this attribution method consistent across runs?” Only Shapley-grounded methods can say yes. A fairness engineer wants to combine local explanations across sub-models in a stack and discovers that non-Shapley methods do not compose cleanly because they violate additivity. A product manager presenting model behavior to the board wants the attributions to add up to the prediction; Shapley values are the only ones that do so exactly.

In 2026 agent stacks, Shapley-based attribution mostly applies to the tabular models agents call as tools. LLM token attribution typically uses gradient methods or circuit analysis instead — Shapley computation over a 4096-token input is intractable. But hybrid stacks (LLM agent + XGBoost scorer + retrieval ranker) need attribution somewhere in the pipeline, and where it appears, Shapley is the right primitive.

How FutureAGI Handles Shapley Values

FutureAGI does not compute Shapley values directly — that lives in attribution libraries such as the shap package (with its Kernel SHAP and Tree SHAP estimators) or fastshap. We sit downstream of the attribution and turn the values into operational signals. At the trace level, Shapley-style attributions are logged as span metadata via fi.client.Client.log, so every prediction in production carries its attribution. At the evaluation level, a CustomEvaluation runs over the attribution vector to detect drift, threshold violations, or proxy-bias signatures. At the bias level, the BiasDetection evaluator scores outcome disparity across cohorts; pairing a Shapley attribution that flags a feature with a BiasDetection score that confirms outcome disparity is the audit-grade signal regulators expect.

Concretely: a healthcare-triage team computes Shapley values per prediction on their tree-based risk model and writes them into traceAI spans alongside the model output. They run a CustomEvaluation weekly: if mean Shapley attribution to a proxy feature (insurance type, ZIP) shifts more than 0.03 between model versions, the evaluation fails. They then run BiasDetection over outcome distributions across patient cohorts; when both signals fire, the audit flags it for human review. FutureAGI does not own the Shapley math, but we own the workflow that turns Shapley attributions into a reliability signal.
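
A hedged sketch of that weekly drift check, written as plain Python over previously logged attribution arrays; the array shapes, feature names, and threshold wiring are assumptions, not FutureAGI's evaluator API:

import numpy as np

PROXY_FEATURES = ["insurance_type", "zip_code"]  # hypothetical column names
DRIFT_THRESHOLD = 0.03

def shapley_drift_check(prev_attributions, curr_attributions, feature_names):
    # prev/curr_attributions: arrays of shape (n_predictions, n_features)
    # holding per-prediction Shapley values for two model versions.
    prev_mean = np.abs(prev_attributions).mean(axis=0)
    curr_mean = np.abs(curr_attributions).mean(axis=0)
    failures = {}
    for name in PROXY_FEATURES:
        idx = feature_names.index(name)
        shift = abs(curr_mean[idx] - prev_mean[idx])
        if shift > DRIFT_THRESHOLD:
            failures[name] = shift
    return failures  # non-empty dict: fail the evaluation, route to human review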

How to Measure or Detect It

Shapley-derived signals to wire into operational dashboards:

  • Mean absolute Shapley value per feature — global importance summary; track drift over time.
  • Per-prediction Shapley magnitude — alert when any single feature exceeds a threshold on protected attributes.
  • Coalition-stability check — recompute Shapley on identical inputs across model versions; values should match exactly for identical models and drift only when the model genuinely changes.
  • BiasDetection — pairs with Shapley attribution to verify whether attribution-flagged features map to outcome disparity.
  • Coverage rate — percentage of production predictions with Shapley values logged; gaps break the audit chain.

Minimal Python — compute and log Shapley values:

import shap
from fi.client import Client

# Tree SHAP computes exact Shapley values for tree ensembles in polynomial time.
explainer = shap.TreeExplainer(model)
shapley = explainer.shap_values(X_row)  # X_row: a single-row feature frame

# Attach the attribution to the logged prediction so the production trace
# carries its Shapley vector alongside the score.
Client().log(
    inputs=X_row.to_dict(),
    outputs={"score": float(prediction)},
    metadata={"shapley_values": shapley.tolist()},
)
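
The logged attributions can then feed the dashboard metrics listed above. A minimal sketch, assuming the per-prediction Shapley vectors have been collected into a single array (the helper names are hypothetical):

import numpy as np

def global_importance(attributions):
    # Mean absolute Shapley value per feature across logged predictions;
    # attributions has shape (n_predictions, n_features). Track drift over time.
    return np.abs(attributions).mean(axis=0)

def coverage_rate(n_with_shapley, n_predictions):
    # Share of production predictions that carry a Shapley vector;
    # anything below 1.0 is a gap in the audit chain.
    return n_with_shapley / n_predictions if n_predictions else 0.0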

Common Mistakes

  • Treating Shapley as causal. Shapley values describe the model’s response to features, not the real-world causal mechanism. They diagnose model behavior, not the world.
  • Using exact Shapley on high-dimensional input. Exact Shapley scales as O(2^n) in feature count; for >20 features use sampling-based or tree-specific approximations (a sampling sketch follows this list).
  • Ignoring feature correlations. Correlated features split Shapley credit unintuitively. A clean attribution to a non-protected feature can still hide a proxy.
  • Comparing Shapley across model architectures. Values are model-relative. A tree-SHAP attribution and a deep-learning kernel-SHAP attribution are not directly comparable.
  • Stopping at the visualization. Shapley plots diagnose; they do not fix. Pair every flagged feature with a remediation step — feature removal, reweighting, or model swap.
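
A minimal Monte Carlo sketch of the sampling-based approximation mentioned above; the function name and defaults are illustrative, and libraries such as shap ship tuned versions of the same idea:

import numpy as np

def sampled_shapley(value_fn, players, n_samples=200, seed=0):
    # Estimate Shapley values by averaging marginal contributions over
    # randomly sampled player orderings instead of all 2^n coalitions.
    rng = np.random.default_rng(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(n_samples):
        coalition = set()
        prev = value_fn(coalition)
        for p in rng.permutation(list(players)):
            coalition.add(p)
            curr = value_fn(coalition)
            totals[p] += curr - prev
            prev = curr
    return {p: totals[p] / n_samples for p in players}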

Frequently Asked Questions

What is a Shapley value?

A Shapley value is the fair share of a total payout assigned to a single cooperating player, based on that player's average marginal contribution across all possible coalitions. The full set of Shapley values satisfies four axioms — efficiency, symmetry, dummy, and additivity — that make them the unique fair allocation.

How are Shapley values different from SHAP values?

Shapley values are a general game-theory concept. SHAP values are Shapley values applied to ML model attribution, where features are players and the prediction is the payout. Every SHAP value is a Shapley value, but Shapley values exist in many domains beyond ML.

Why do Shapley values matter for FutureAGI?

Shapley-based attribution is the math behind explainability tools we audit. FutureAGI's BiasDetection complements Shapley-style explanations by scoring outcome disparity across cohorts when an attribution flags a suspicious feature.