Find out which words a Naive Bayes classifier uses to make its decisions

Question

I am using Naive Bayes for text classification in Python and want to find out which words are used to decide which class a text belongs to.

I found this answer, https://stackoverflow.com/a/62097661/3992979, but it did not help me because my vectorizer has no get_feature_names() method and my Naive Bayes classifier has no coef_ attribute.

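df_train is a data frame with manually labelled training data. df_test is a data frame with unlabelled data that NB should classify. There are only two classes: terror = 1 for texts about terrorist attacks and terror = 0 for texts that do not cover that topic.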

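from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

### Create "Bag of Words"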
vec = CountVectorizer(
    ngram_range=(1, 3)
)

x_train = vec.fit_transform(df_train.clean_text)
x_test = vec.transform(df_test.clean_text)

y_train = df_train.terror
y_test = df_test.terror

### Train and evaluate the model (Naive Bayes classification)
nb = MultinomialNB()
nb.fit(x_train, y_train)

preds = nb.predict(x_test)

Answer 1

I found the answer through trial and error:

import numpy as np

### Get the words that trigger the AI detection
# feature_log_prob_ has one row per class and one column per n-gram
features_log_prob = nb.feature_log_prob_
feature_names = vec.get_feature_names_out()

def show_top100(classifier, vectorizer, categories):
  feature_names = vectorizer.get_feature_names_out()
  for i, category in enumerate(categories):
    # argsort is ascending, so the last 100 indices have the highest log-probability
    top100 = np.argsort(classifier.feature_log_prob_[i])[-100:]
    print("%s: %s" % (category, " ".join(feature_names[top100])))

show_top100(nb, vec, nb.classes_)
