What Is SHAP?

SHAP (SHapley Additive exPlanations) is a model-explanation framework that uses cooperative game-theoretic Shapley values to attribute each input feature's contribution to a single prediction.

SHAP, short for SHapley Additive exPlanations, is a unified framework for explaining individual predictions of a machine-learning model by assigning each input feature a contribution score. The scores are Shapley values from cooperative game theory: the fair payout to a “player” (feature) given all possible coalitions of other features. SHAP, introduced by Lundberg and Lee in 2017, unifies prior attribution methods (LIME, DeepLIFT, integrated gradients) under axioms of local accuracy, missingness, and consistency. It is widely used on tabular and tree-based models; for LLMs, token-saliency and circuit analysis are more common.
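
For reference, the Shapley value assigned to feature i is its marginal contribution averaged over every coalition S of the remaining features, weighted so that each ordering of features counts equally (here F is the full feature set and f_S is the model's expected output when only the features in S are known):

\phi_i = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|! \,(|F| - |S| - 1)!}{|F|!} \Big( f_{S \cup \{i\}}(x_{S \cup \{i\}}) - f_S(x_S) \Big)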

Why It Matters in Production LLM and Agent Systems

Model decisions move money, deny loans, route care, and approve content. Saying “the model said so” is not enough: auditors, end-users, and engineers all need to know why. SHAP gives a per-prediction breakdown: this feature pushed the score up by 0.18, that one pulled it down by 0.07, and the attributions plus the model's base value (its average output) sum exactly to the prediction. That property, local accuracy, is what makes SHAP defensible in regulated workflows.
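
A minimal sketch of that additivity check, assuming the shap package and a toy scikit-learn regressor (every name below is illustrative, not part of any FutureAGI API):

import numpy as np
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Toy tabular model; any tree ensemble works with TreeExplainer.
X, y = make_regression(n_samples=200, n_features=5, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_vals = explainer.shap_values(X[:1])   # per-feature attributions, shape (1, 5)

# Local accuracy: base value + sum of attributions reconstructs the prediction.
base = float(np.ravel(explainer.expected_value)[0])
assert np.isclose(base + shap_vals[0].sum(), model.predict(X[:1])[0])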

The pain of skipping explanation shows up across roles. A compliance lead is asked under the EU AI Act, “explain why this individual was rejected” and has nothing structured to hand over. A data scientist debugs a fairness regression and cannot tell whether the model is using a protected feature directly or a correlated proxy. A product manager rolls out a credit-decision model and gets a stream of customer disputes she cannot answer.

In 2026 agent stacks, classical SHAP has limited coverage — most generative outputs are produced by transformers where Shapley computation over tokens is intractable. But SHAP still matters where agents call into tabular models for routing, scoring, or risk classification: a triage agent that uses an XGBoost classifier to prioritize tickets needs SHAP-style explanations on each classification, especially when an end-user disputes the routing decision.
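
As an illustration, the triage step can hand back its attribution alongside the decision so a disputed routing can be explained later. The classifier, feature names, and route_ticket helper below are hypothetical; only the shap and xgboost calls are real library APIs.

import numpy as np
import shap
import xgboost as xgb

# Hypothetical ticket features used by the triage classifier.
FEATURES = ["age_hours", "customer_tier", "error_count", "mentions_outage"]

# Stand-in training data; in practice this is labeled triage history.
rng = np.random.default_rng(0)
X = rng.random((500, len(FEATURES)))
y = (X[:, 2] + X[:, 3] > 1.0).astype(int)   # toy "high priority" label
model = xgb.XGBClassifier(n_estimators=50, max_depth=3).fit(X, y)
explainer = shap.TreeExplainer(model)

def route_ticket(ticket: np.ndarray) -> dict:
    """Return the routing decision plus its per-feature SHAP attribution."""
    row = ticket.reshape(1, -1)
    priority = int(model.predict(row)[0])
    attributions = np.ravel(explainer.shap_values(row))
    return {"priority": priority, "shap": dict(zip(FEATURES, attributions.tolist()))}

print(route_ticket(X[0]))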

How FutureAGI Handles SHAP

FutureAGI does not compute SHAP values directly; Shapley computation belongs to the modeling library (the shap Python package, tree-explainer kernels, and so on). We sit downstream. At the evaluation level, when a model ships with SHAP attributions as part of its output, we treat the attributions as additional fields on the trace and let you write CustomEvaluation rules over them, for example an alert when a protected feature's SHAP magnitude exceeds a threshold for any prediction. At the audit level, FutureAGI's audit log and Dataset versioning preserve the inputs and outputs of every prediction, so a SHAP recomputation against the same input gives the same result months later; reproducibility is the precondition for explanation. At the bias level, the BiasDetection evaluator complements SHAP by scoring outputs across cohorts: SHAP explains why a single prediction came out the way it did, while BiasDetection tells you whether the model behaves differently across groups.

Concretely: a fintech team ships a credit-scoring model with SHAP attributions logged on every prediction. They route every prediction through traceAI as a span with shap.values as a span attribute, sample disputed predictions into a Dataset, and run a CustomEvaluation that flags any prediction where SHAP attribution to zip_code exceeds 0.05. FutureAGI does not generate the SHAP value, but it makes the explanation queryable, auditable, and testable — the difference between “we have explanations” and “we can prove the explanations were not used as a cover for proxy bias.”
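
The flagging rule itself is a threshold check over the logged attribution. A plain-Python sketch of that logic (the zip_code field and 0.05 threshold come from the example above; the exact CustomEvaluation wiring in FutureAGI may differ):

from typing import Mapping

PROXY_THRESHOLD = 0.05  # maximum tolerated |SHAP| attribution on zip_code

def flag_proxy_bias(shap_values: Mapping[str, float]) -> bool:
    """True when the zip_code attribution exceeds the tolerated magnitude."""
    return abs(shap_values.get("zip_code", 0.0)) > PROXY_THRESHOLD

# One credit-scoring prediction's logged attributions (illustrative values).
prediction_shap = {"income": 0.21, "dti_ratio": -0.08, "zip_code": 0.07}
assert flag_proxy_bias(prediction_shap)   # 0.07 > 0.05, so this prediction is flagged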

How to Measure or Detect It

SHAP-driven measurement signals to wire into your reliability stack:

  • Per-feature attribution magnitude — the absolute SHAP value per feature, per prediction. Spikes on protected or proxy features should fire alerts.
  • Global feature importance — mean absolute SHAP value across a cohort; a stable summary you can monitor for drift over time (see the sketch after this list).
  • BiasDetection evaluator — pairs with SHAP: confirms whether attribution differences across cohorts correspond to real outcome differences.
  • Reproducibility check — recompute SHAP values on the same input across model versions; a drift in attributions on identical inputs indicates a model change.
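
A sketch of the global-importance signal above, computed per cohort so disparities are not averaged away (the DataFrame columns and cohort labels are illustrative; real attributions would come from your logged predictions):

import numpy as np
import pandas as pd

# One row of SHAP values per prediction, plus a cohort label for each prediction.
rng = np.random.default_rng(0)
shap_df = pd.DataFrame(rng.normal(size=(1000, 3)), columns=["income", "dti_ratio", "zip_code"])
cohort = pd.Series(rng.choice(["region_a", "region_b"], size=1000))

# Global importance = mean |SHAP| per feature; slicing by cohort exposes
# disparities that the single global mean would hide.
global_importance = shap_df.abs().mean().sort_values(ascending=False)
cohort_importance = shap_df.abs().groupby(cohort).mean()

print(global_importance)
print(cohort_importance)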

Minimal Python sketch for logging SHAP values alongside predictions. It assumes a fitted tree-based model, a feature matrix, and an already-computed prediction; adapt the Client call to your FutureAGI SDK version:

import shap
from fi.client import Client

# Assumes `model` is a fitted tree-based model and `features` is the input
# for which `prediction` was produced.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(features)

# Attach the attributions to the logged prediction so they stay queryable later.
Client().log(
    inputs=features,
    outputs=prediction,
    metadata={"shap_values": shap_values.tolist()},
)

Common Mistakes

  • Treating SHAP as causal. SHAP explains the model’s behavior on the input, not the real-world causal mechanism — high SHAP on a feature does not prove the feature causes the outcome.
  • Using kernel SHAP on transformers. Exact Shapley computation is exponential in the number of features, and even Kernel SHAP's sampling approximation is far too expensive for token-level LLM inputs. Use token-saliency or circuit analysis instead.
  • Aggregating SHAP without cohorts. A global mean hides cohort-level disparity. Slice by user segment, geography, and protected attribute.
  • Ignoring feature correlation. SHAP values split credit among correlated features in unintuitive ways — a clean attribution can still hide proxy bias.
  • Stopping at the SHAP plot. The plot is the diagnosis, not the remedy. Pair every flagged feature with a fix path: feature removal, reweighting, model swap.

Frequently Asked Questions

What is SHAP?

SHAP (SHapley Additive exPlanations) is a model-explanation framework that uses cooperative-game Shapley values to fairly attribute each input feature's contribution to a single prediction, unifying methods like LIME and integrated gradients under one axiom set.

How is SHAP different from LIME?

LIME fits a local linear surrogate to approximate the model around one prediction. SHAP guarantees consistency and local accuracy via Shapley-value math: attributions plus the model's base value sum exactly to the prediction, and exact variants such as TreeSHAP return the same attributions on every run. LIME's sampled surrogate guarantees neither.

How does FutureAGI work with SHAP?

FutureAGI does not compute SHAP values. We evaluate model outputs and behavior; if a tabular model ships with SHAP-based explanations, FutureAGI's BiasDetection and audit-log surfaces help verify that the explanations track real performance across cohorts.