SHAP (SHapley Additive exPlanations) is a model explainability framework built on Shapley values from cooperative game theory. It addresses the cost of computing exact Shapley values directly by combining sampling-based approximations with model-specific algorithms. Here's the gist:
- Baseline Prediction: Establish a baseline as the average model prediction over a background (reference) dataset.
- Feature Permutations: Generate all possible feature orderings (permutations). Note that the order in which features are introduced can drastically impact their marginal contribution.
- Build and Retrain Models: For each ordering, add features one at a time according to the permutation and compare the model's predictions with and without the current feature to measure its marginal contribution in that specific order. (The exact formulation calls for retraining a model on every feature subset; in practice SHAP avoids retraining by simulating "missing" features with values drawn from the background dataset.)
- SHAP Value Approximation: Averaging these marginal contributions over every possible ordering yields the exact Shapley values, but enumerating all orderings is intractable beyond a handful of features. SHAP therefore estimates the average by sampling orderings, or exploits model structure (as TreeSHAP does for tree ensembles). A minimal sketch of the sampling approach follows this list.
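To make the procedure concrete, here is a minimal sketch of the permutation-sampling idea in plain NumPy. It assumes a `predict` function that maps an (n, d) array to n model outputs and, instead of retraining, simulates "missing" features by substituting values from a background dataset; the function name `permutation_shap` and all details are illustrative, not the shap library's API.

```python
import numpy as np

def permutation_shap(predict, x, background, n_permutations=200, seed=0):
    """Estimate Shapley values for one instance x (shape (d,)) by sampling feature orderings."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    phi = np.zeros(d)

    for _ in range(n_permutations):
        order = rng.permutation(d)
        # Start from a random background row: every feature is "missing".
        z = background[rng.integers(len(background))].copy()
        prev = predict(z[None, :])[0]
        # Reveal the instance's features one by one in this ordering; the change
        # in the prediction is the feature's marginal contribution for this order.
        for j in order:
            z[j] = x[j]
            curr = predict(z[None, :])[0]
            phi[j] += curr - prev
            prev = curr

    # Averaging over sampled orderings approximates the Shapley values.
    return phi / n_permutations
```

Because each ordering's contributions telescope from a background prediction up to the prediction for `x`, the estimates sum (in expectation) to the prediction minus the baseline, which is the additivity property discussed under Strengths below.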
Strengths
- Solid Theoretical Grounding: Rooted in the axiomatic fairness properties of Shapley values (efficiency, symmetry, dummy, and additivity).
- Global and Local Interpretability: SHAP explanations offer insights into the overall model (how features generally affect output) and individual predictions (why the model arrived at a specific decision).
- Consistency and Local Accuracy: A feature's attribution never decreases when the model is changed to rely on it more, and the SHAP values for an instance sum to the difference between the model's output and the baseline, so an explanation always accounts for the full prediction (verified in the snippet after this list).
- Model Agnostic: Works with a wide variety of machine learning models (tree-based, linear, neural networks, etc.).
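As a concrete illustration of local and global explanations, and of the additivity property, the following sketch uses the shap package's generic `Explainer` with a scikit-learn model; the dataset, model choice, and plotting calls are assumptions made for this example rather than anything prescribed above.

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor

# Fit any model; shap.Explainer picks an appropriate algorithm for it.
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor().fit(X, y)

explainer = shap.Explainer(model, X)   # X also serves as the background dataset
shap_values = explainer(X)             # Explanation object, one row per instance

# Local: why did the model produce this particular prediction?
shap.plots.waterfall(shap_values[0])

# Global: how do features generally push predictions up or down?
shap.plots.beeswarm(shap_values)

# Additivity: SHAP values plus the baseline reproduce the model's output.
i = 0
print(shap_values[i].values.sum() + shap_values[i].base_values,
      model.predict(X.iloc[[i]])[0])
```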
Weaknesses
- Approximation: Except where model-specific algorithms compute them exactly (e.g., TreeSHAP for tree ensembles), SHAP values are estimates of the true Shapley values, and their accuracy depends on the sampling budget and the choice of background data, though in practice the estimates are usually good.
- Feature Correlation: Highly correlated features are problematic: simulating "missing" features by marginalizing over the background data implicitly assumes feature independence, which can create unrealistic feature combinations and split credit between correlated features in unintuitive ways (see the small illustration after this list).
- Human Interpretation: While SHAP provides valuable insights, interpreting feature importance still requires a degree of domain expertise.
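To see why correlation is awkward, consider the following NumPy-only illustration (purely hypothetical data, not shap code): marginal imputation of a "missing" feature ignores its correlation with the features that are kept, so the evaluation points handed to the model can fall far off the data manifold.

```python
import numpy as np

rng = np.random.default_rng(0)

# Background data with two strongly correlated features,
# e.g. the same quantity measured in two different units.
x1 = rng.normal(size=1000)
x2 = x1 + rng.normal(scale=0.05, size=1000)
background = np.column_stack([x1, x2])

# Instance being explained: both features are high, consistently.
x = np.array([2.0, 2.0])

# Simulate "feature 2 is missing" by keeping feature 1 fixed and substituting
# feature 2 from random background rows. Ignoring the correlation manufactures
# combinations like (2.0, -1.3) that the model never saw during training.
idx = rng.integers(len(background), size=5)
synthetic = np.column_stack([np.full(5, x[0]), background[idx, 1]])
print(synthetic)
```

Conditional variants (such as tree-path-dependent perturbation) or grouping correlated features can mitigate this, but each option brings its own interpretive trade-offs.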