I have a question about SHAP explanations for an XGBoost classifier. Specifically, how should we interpret explainer.expected_value? Why, after the sigmoid transformation, is it not the same as y_train.mean()? Many thanks!
Below is a summary of the code for quick reference. The full code is available in this notebook: https://github.com/MenaWANG/ML_toy_examples/blob/main/explain%20models/shap_XGB_classification.ipynb
import numpy as np
import pandas as pd
import xgboost as xgb
import shap
from scipy.special import expit  # sigmoid
model = xgb.XGBClassifier()
model.fit(X_train, y_train)
explainer = shap.Explainer(model)  # dispatches to TreeExplainer for XGBoost models
shap_test = explainer(X_test)
shap_df = pd.DataFrame(shap_test.values)
# For each case, summing the SHAP values across all features and adding the expected value gives the raw margin (log-odds) for that case, which can then be passed through the sigmoid to recover the predicted probability:
np.isclose(model.predict(X_test, output_margin=True), explainer.expected_value + shap_df.sum(axis=1)).all()
# True
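For completeness, here is a minimal sketch (reusing the objects defined above, and assuming a binary classifier with the default logistic objective) showing that the sigmoid of these summed margins matches the model's predicted probability for the positive class:

# Sigmoid of (expected_value + sum of SHAP values) matches predict_proba for class 1
probs_from_shap = expit(explainer.expected_value + shap_df.sum(axis=1))
probs_from_model = model.predict_proba(X_test)[:, 1]
np.allclose(probs_from_shap, probs_from_model)
# True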
But why isn't the check below true? Why, after the sigmoid transformation, is explainer.expected_value not the same as y_train.mean() for an XGBoost classifier? Thanks again!
expit(explainer.expected_value) == y_train.mean()
#False
SHAP values are guaranteed to be additive in the raw-score (logit) space. To see why additivity of raw scores does not carry over to additivity of predicted probabilities, consider why exp(x + y) != exp(x) + exp(y).
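Roughly speaking, explainer.expected_value is an average raw margin over the background data, whereas y_train.mean() is an average of labels (probabilities). Because the sigmoid is nonlinear, the sigmoid of an average margin is not the same as the average of the sigmoid-transformed margins. A minimal numeric sketch with made-up margins:

import numpy as np
from scipy.special import expit

# Hypothetical raw margins (log-odds) for four cases
margins = np.array([-3.0, -1.0, 2.0, 4.0])

print(expit(margins.mean()))   # sigmoid of the mean margin   -> ~0.622
print(expit(margins).mean())   # mean of the sigmoid margins  -> ~0.545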