How do I create LIME explanations for tabular data with a Sequential Keras model?

Question

How can I use the LIME interpretability method (specifically LimeTabularExplainer) to explain a neural network (a Keras Sequential model)?

I am using the Adult dataset (binary classification on tabular data) and one-hot encode it with sklearn. I then build a neural network with a Keras Sequential model. The code that creates the model is:

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop

model = Sequential([
  Dense(104, activation='relu', input_shape = [X_train_ohe.shape[1]]),
  Dropout(0.2),
  Dense(256, activation='relu'),
  Dropout(0.2),
  Dense(32, activation='relu'),  
  Dropout(0.2),
  Dense(1, activation='sigmoid')                     
])

model.compile(optimizer=RMSprop(learning_rate = 0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])

history = model.fit(X_train_ohe, y_train, epochs=20, validation_data=(X_test_ohe, y_test))

test_loss, test_acc = model.evaluate(X_test_ohe,  y_test, verbose=2)

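I want to explain the model with LIME. This is the code that creates the LIME explanation, but it fails with the error message shown after it: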

import lime
import lime.lime_tabular

feature_names = list(data_enc.columns[:-1])
class_names = list(np.unique(data.salary))

predict_fn = lambda x: model.predict(encoder.transform(x))

explainer = lime.lime_tabular.LimeTabularExplainer(X_train ,feature_names = feature_names,
                                                   class_names=class_names,
                                                   categorical_features=categorical_features, 
                                                   categorical_names=categorical_names, kernel_width=3)

exp = explainer.explain_instance(X_test[0], predict_fn, num_features=5)
exp.show_in_notebook(show_all=False)

Error:

157/157 [==============================] - 0s 1ms/step

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_4301/2163859068.py in <module>
     12                                                    categorical_names=categorical_names, kernel_width=3)
     13 
---> 14 exp = explainer.explain_instance(X_test[0], predict_fn, num_features=5)
     15 exp.show_in_notebook(show_all=False)

~/anaconda3/lib/python3.9/site-packages/lime/lime_tabular.py in explain_instance(self, data_row, predict_fn, labels, top_labels, num_features, num_samples, distance_metric, model_regressor)
    450             (ret_exp.intercept[label],
    451              ret_exp.local_exp[label],
--> 452              ret_exp.score, ret_exp.local_pred) = self.base.explain_instance_with_data(
    453                     scaled_data,
    454                     yss,

~/anaconda3/lib/python3.9/site-packages/lime/lime_base.py in explain_instance_with_data(self, neighborhood_data, neighborhood_labels, distances, label, num_features, feature_selection, model_regressor)
    180 
    181         weights = self.kernel_fn(distances)
--> 182         labels_column = neighborhood_labels[:, label]
    183         used_features = self.feature_selection(neighborhood_data,
    184                                                labels_column,

IndexError: index 1 is out of bounds for axis 1 with size 1
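
The traceback fails at `neighborhood_labels[:, label]` with index 1 on an axis of size 1. That is consistent with LIME's classification mode expecting `predict_fn` to return one probability column per class, while a single sigmoid unit returns shape `(n, 1)`. The following is a minimal sketch of a wrapper that expands the output to two columns. It assumes the sigmoid output is the probability of the second entry in `class_names`. `to_two_class_proba` and `fake_sigmoid_predict` are hypothetical names, the latter standing in for `model.predict(encoder.transform(x))` so the snippet runs on its own.

```python
import numpy as np

def to_two_class_proba(p):
    """Turn sigmoid outputs of shape (n,) or (n, 1) into an (n, 2) array
    whose columns are [P(class 0), P(class 1)]."""
    p = np.asarray(p, dtype=float).reshape(-1, 1)
    return np.hstack([1.0 - p, p])

def fake_sigmoid_predict(x):
    # Hypothetical stand-in for model.predict(encoder.transform(x)).
    # Like a Dense(1, activation='sigmoid') head, it returns shape (n, 1).
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp(-x.sum(axis=1, keepdims=True)))

def predict_fn(x):
    # Two-column probabilities, one column per class.
    return to_two_class_proba(fake_sigmoid_predict(x))

probs = predict_fn(np.zeros((3, 4)))
print(probs.shape)
```

With the real model, the same idea would read `predict_fn = lambda x: to_two_class_proba(model.predict(encoder.transform(x)))`, assuming `encoder.transform` returns a dense array that the model accepts.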

I also tried RecurrentTabularExplainer, but unfortunately without success:

explainer = lime.lime_tabular.RecurrentTabularExplainer(X_train, training_labels=y_train, 
                                                        feature_names=feature_names,
                                                        discretize_continuous=True,
                                                        class_names=class_names,
                                                        categorical_features=categorical_features,
                                                        categorical_names=categorical_names,
                                                        discretizer='decile')

Error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_4301/2936629705.py in <module>
----> 1 explainer = lime.lime_tabular.RecurrentTabularExplainer(X_train, training_labels=y_train, 
      2                                                         feature_names=feature_names,
      3                                                    discretize_continuous=True,
      4                                                    class_names=class_names,
      5                                                         categorical_features=categorical_features,

~/anaconda3/lib/python3.9/site-packages/lime/lime_tabular.py in __init__(self, training_data, mode, training_labels, feature_names, categorical_features, categorical_names, kernel_width, kernel, verbose, class_names, feature_selection, discretize_continuous, discretizer, random_state)
    613 
    614         # Reshape X
--> 615         n_samples, n_timesteps, n_features = training_data.shape
    616         training_data = np.transpose(training_data, axes=(0, 2, 1)).reshape(
    617                 n_samples, n_timesteps * n_features)

ValueError: not enough values to unpack (expected 3, got 2)
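
The line flagged in this traceback unpacks `training_data.shape` into `n_samples, n_timesteps, n_features`, so this explainer appears to expect 3-D sequence data rather than a 2-D tabular array. The short sketch below reproduces the same unpacking on dummy arrays. The array sizes and the helper name `unpack_recurrent_shape` are made up for illustration.

```python
import numpy as np

def unpack_recurrent_shape(a):
    # Same unpacking as the line shown in the traceback.
    n_samples, n_timesteps, n_features = a.shape
    return n_samples, n_timesteps, n_features

X_2d = np.zeros((5, 3))     # (n_samples, n_features), like ordinary tabular data
X_3d = np.zeros((5, 4, 3))  # (n_samples, n_timesteps, n_features)

dims_3d = unpack_recurrent_shape(X_3d)

try:
    unpack_recurrent_shape(X_2d)
    error_2d = None
except ValueError as exc:
    error_2d = str(exc)

print(dims_3d)
print(error_2d)
```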

Thank you very much in advance for your help!
