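How can I use the LIME explainability method (specifically LimeTabularExplainer) to explain a neural network (a Keras Sequential model)?
I am working with the Adult dataset (binary classification on tabular data). I one-hot encoded it with sklearn and then built a neural network using a Keras Sequential model. The code that creates the model is: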
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
model = Sequential([
    Dense(104, activation='relu', input_shape=[X_train_ohe.shape[1]]),
    Dropout(0.2),
    Dense(256, activation='relu'),
    Dropout(0.2),
    Dense(32, activation='relu'),
    Dropout(0.2),
    Dense(1, activation='sigmoid')
])
model.compile(optimizer=RMSprop(learning_rate=0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])
history = model.fit(X_train_ohe, y_train, epochs=20, validation_data=(X_test_ohe, y_test))
test_loss, test_acc = model.evaluate(X_test_ohe, y_test, verbose=2)
I want to explain the model with LIME. Below is the code I use to create the LIME explanation; it fails with the error shown after it:
import numpy as np
import lime
import lime.lime_tabular
feature_names = list(data_enc.columns[:-1])
class_names = list(np.unique(data.salary))
# Encode the raw rows that LIME perturbs, then ask the Keras model for predictions
predict_fn = lambda x: model.predict(encoder.transform(x))
explainer = lime.lime_tabular.LimeTabularExplainer(X_train, feature_names=feature_names,
                                                   class_names=class_names,
                                                   categorical_features=categorical_features,
                                                   categorical_names=categorical_names, kernel_width=3)
exp = explainer.explain_instance(X_test[0], predict_fn, num_features=5)
exp.show_in_notebook(show_all=False)
Error:
157/157 [==============================] - 0s 1ms/step
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
/tmp/ipykernel_4301/2163859068.py in <module>
12 categorical_names=categorical_names, kernel_width=3)
13
---> 14 exp = explainer.explain_instance(X_test[0], predict_fn, num_features=5)
15 exp.show_in_notebook(show_all=False)
~/anaconda3/lib/python3.9/site-packages/lime/lime_tabular.py in explain_instance(self, data_row, predict_fn, labels, top_labels, num_features, num_samples, distance_metric, model_regressor)
450 (ret_exp.intercept[label],
451 ret_exp.local_exp[label],
--> 452 ret_exp.score, ret_exp.local_pred) = self.base.explain_instance_with_data(
453 scaled_data,
454 yss,
~/anaconda3/lib/python3.9/site-packages/lime/lime_base.py in explain_instance_with_data(self, neighborhood_data, neighborhood_labels, distances, label, num_features, feature_selection, model_regressor)
180
181 weights = self.kernel_fn(distances)
--> 182 labels_column = neighborhood_labels[:, label]
183 used_features = self.feature_selection(neighborhood_data,
184 labels_column,
IndexError: index 1 is out of bounds for axis 1 with size 1
I also tried RecurrentTabularExplainer, but unfortunately that did not work either:
explainer = lime.lime_tabular.RecurrentTabularExplainer(X_train, training_labels=y_train,
feature_names=feature_names,
discretize_continuous=True,
class_names=class_names,
categorical_features=categorical_features,
categorical_names=categorical_names,
discretizer='decile')
Error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipykernel_4301/2936629705.py in <module>
----> 1 explainer = lime.lime_tabular.RecurrentTabularExplainer(X_train, training_labels=y_train,
2 feature_names=feature_names,
3 discretize_continuous=True,
4 class_names=class_names,
5 categorical_features=categorical_features,
~/anaconda3/lib/python3.9/site-packages/lime/lime_tabular.py in __init__(self, training_data, mode, training_labels, feature_names, categorical_features, categorical_names, kernel_width, kernel, verbose, class_names, feature_selection, discretize_continuous, discretizer, random_state)
613
614 # Reshape X
--> 615 n_samples, n_timesteps, n_features = training_data.shape
616 training_data = np.transpose(training_data, axes=(0, 2, 1)).reshape(
617 n_samples, n_timesteps * n_features)
ValueError: not enough values to unpack (expected 3, got 2)
Thank you very much in advance, I really appreciate any help!
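A possible reading of the IndexError, offered as a sketch rather than a confirmed fix: in classification mode, LimeTabularExplainer indexes the prediction array by class label (neighborhood_labels[:, label] in the traceback), so it appears to expect predict_fn to return one probability column per class, i.e. shape (n_samples, 2) for this problem. A Dense(1, activation='sigmoid') output has shape (n_samples, 1), which would match "index 1 is out of bounds for axis 1 with size 1". The helper name to_two_class_proba and the stand-in array below are hypothetical and only illustrate the shape conversion with NumPy; they are not part of LIME or Keras.

```python
import numpy as np

def to_two_class_proba(p_positive):
    """Convert positive-class probabilities of shape (n, 1) or (n,)
    into an (n, 2) array with columns [P(class 0), P(class 1)]."""
    p = np.asarray(p_positive, dtype=float).reshape(-1, 1)
    return np.hstack([1.0 - p, p])

# Stand-in for what model.predict(...) returns from a single sigmoid unit.
fake_sigmoid_output = np.array([[0.9], [0.2], [0.5]])
proba = to_two_class_proba(fake_sigmoid_output)
print(proba.shape)

# Assumed usage with the names from the question (not executed here):
# predict_fn = lambda x: to_two_class_proba(model.predict(encoder.transform(x)))
```

If encoder.transform returns a sparse matrix, it may additionally need to be converted to a dense array before being passed to model.predict; that depends on how the encoder was configured. As for the second error, the traceback shows the RecurrentTabularExplainer constructor unpacking training_data.shape into (n_samples, n_timesteps, n_features), so it seems to be meant for 3-D sequence input rather than 2-D tabular data like X_train.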