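I am recreating the summary plot from the SHAP library using Plotly. I have two datasets: the SHAP values (shap_values) and the one-hot encoded training data (train).
My goal is to create a bee swarm plot of the SHAP values and color each point according to whether the corresponding value of the one-hot encoded variable is 0 or 1.
Here is my code. It produces the plot I described, but only for one variable at a time: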
figures = []
for column in shap_values.columns:
    fig = px.strip(merged_df, x=merged_df[column+'_shap'], color=merged_df[column+'_train'], orientation='h', stripmode='overlay')
    fig.update_layout(
        title=f'Bee swarm plot de la valeur de Shapley pour {column}',
        xaxis_title='Valeur de Shapley (impact sur la sortie du modèle)',
        yaxis_title='Caractéristique'
    )
    figures.append(fig)
Is there a way to combine all of these plots into a single figure?
Here is a sample of the data:
shap_values = pd.DataFrame(
    {"A" : [-0.065704,-0.096510,0.062368,0.062368,0.063093],
     'B' : [-0.168249,-0.173284,-0.168756,-0.168756,-0.169378]})
train = pd.DataFrame(
    {"A" : [0,1,1,0,0],
     'B' : [1,1,0,0,1]})
merged_df = shap_values.join(train, lsuffix='_shap', rsuffix='_train')
import pandas as pd
import plotly.express as px
# Sample data
shap_values = pd.DataFrame({
    "A": [-0.065704, -0.096510, 0.062368, 0.062368, 0.063093],
    "B": [-0.168249, -0.173284, -0.168756, -0.168756, -0.169378]
})
train = pd.DataFrame({
    "A": [0, 1, 1, 0, 0],
    "B": [1, 1, 0, 0, 1]
})
# Joining SHAP values and one-hot encoded features
merged_df = shap_values.join(train, lsuffix='_shap', rsuffix='_train')
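# Melt the SHAP columns to long format. ignore_index=False keeps the
# original row labels, so each value can be matched back to merged_df
# (with the default reset index, the lookup below raises a KeyError)
melted_df = merged_df.melt(
    value_vars=[col for col in merged_df.columns if col.endswith('_shap')],
    var_name='Feature',
    value_name='SHAP Value',
    ignore_index=False)
# Extract the original feature name and look up the matching
# one-hot encoded value for the same row
melted_df['Feature'] = melted_df['Feature'].str.replace('_shap', '', regex=False)
melted_df['One-hot Value'] = melted_df.apply(
    lambda x: merged_df.loc[x.name, x['Feature'] + '_train'], axis=1)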
# One strip per feature on the y-axis, colored by the one-hot value
fig = px.strip(melted_df, x='SHAP Value', y='Feature',
               color='One-hot Value',
               orientation='h', stripmode='overlay',
               title='Bee Swarm Plot of SHAP Values by Feature')
fig.update_layout(
    xaxis_title='SHAP Value (Impact on Model Output)',
    yaxis_title='Feature')
fig.show()