sklearn: multi-class problem and reporting sensitivity and specificity

Question

I have a three-class problem, and I can report precision and recall for each class with the following code:

from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))

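This gives me the precision and recall for each of the 3 classes in a table format.

My question is: how can I now get the sensitivity and specificity for each of the 3 classes? I looked through sklearn.metrics and did not find anything that reports sensitivity and specificity.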

Answer 1

If we check the help page for classification_report:

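Note that in binary classification, recall of the positive class is also known as "sensitivity"; recall of the negative class is "specificity".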
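To make that note concrete, here is a minimal sketch for the plain binary case. The labels are toy values made up for illustration (they are not from the question), and it assumes recall_score with its default average='binary', where pos_label selects which class counts as positive.

```python
from sklearn.metrics import recall_score

# Toy binary labels, made up for illustration: 1 = positive, 0 = negative.
y_true_bin = [1, 1, 1, 0, 0, 0, 0]
y_pred_bin = [1, 1, 0, 0, 0, 0, 1]

# Sensitivity: recall of the positive class, TP / (TP + FN).
sensitivity = recall_score(y_true_bin, y_pred_bin, pos_label=1)
# Specificity: recall of the negative class, TN / (TN + FP).
specificity = recall_score(y_true_bin, y_pred_bin, pos_label=0)

print(sensitivity, specificity)
```

Here 2 of the 3 positives and 3 of the 4 negatives are predicted correctly, so the two values are 2/3 and 3/4.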

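So, for each class, we can convert the labels and predictions to binary (this class versus the rest) and then use the recall values returned by

precision_recall_fscore_support

for the positive and negative labels.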

Example usage:

from sklearn.metrics import classification_report
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']
print(classification_report(y_true, y_pred, target_names=target_names))

which prints:

              precision    recall  f1-score   support

     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3

    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5

Using sklearn, one class at a time:

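import numpy as np
import pandas as pd
from sklearn.metrics import precision_recall_fscore_support

res = []
for l in [0, 1, 2]:
    # One-vs-rest: class l is the positive (True) label, everything else is False.
    prec, recall, _, _ = precision_recall_fscore_support(np.array(y_true) == l,
                                                      np.array(y_pred) == l,
                                                      pos_label=True, average=None)
    # Labels are sorted as [False, True], so recall[1] is the recall of class l
    # (sensitivity) and recall[0] is the recall of the rest (specificity).
    res.append([l, recall[1], recall[0]])

Put the results into a DataFrame:

pd.DataFrame(res, columns=['class', 'sensitivity', 'specificity'])

   class  sensitivity  specificity
0      0     1.000000         0.75
1      1     0.000000         0.75
2      2     0.666667         1.00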
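As an alternative sketch (not part of the original answer), the same one-vs-rest counts can be read from multilabel_confusion_matrix, which returns one 2x2 matrix per class laid out as [[TN, FP], [FN, TP]]. The example reuses the toy labels from this answer. It assumes every class has at least one positive and one negative sample, so neither denominator is zero.

```python
from sklearn.metrics import multilabel_confusion_matrix

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
labels = [0, 1, 2]

# One 2x2 matrix per class, laid out as [[TN, FP], [FN, TP]].
mcm = multilabel_confusion_matrix(y_true, y_pred, labels=labels)
tn, fp = mcm[:, 0, 0], mcm[:, 0, 1]
fn, tp = mcm[:, 1, 0], mcm[:, 1, 1]

# Per-class rates, computed element-wise over the three classes.
sensitivity = tp / (tp + fn)  # recall of the class itself
specificity = tn / (tn + fp)  # recall of "everything else"

for label, sens, spec in zip(labels, sensitivity, specificity):
    print(f"class {label}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Working from the raw counts avoids having to remember which index of the recall array belongs to the positive label.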

Answer 2

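The output of classification_report is a formatted string. This snippet extracts the values from it and stores them in a 2D list. Note: to understand the code better, add print statements to inspect the variable values.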

from sklearn.metrics import classification_report

y = classification_report(y_test, y_pred)  # classification report's output is a string
lines = y.split('\n')  # extract every line and store in a list
res = []  # list to store the cleaned results
for i in range(len(lines)):
    line = lines[i].split(" ")  # values are separated by blanks; split at the blank spaces
    line = [j for j in line if j != '']  # keep only the non-empty values
    if len(line) != 0:
        # blank lines produce empty lists; skip those
        res.append(line)
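If the goal is only to get the report values into a data structure, a possibly simpler route is the output_dict parameter of classification_report, which returns a nested dict instead of a string. A minimal sketch, using toy labels rather than the asker's y_test and y_pred:

```python
from sklearn.metrics import classification_report

# Toy labels for illustration; substitute your own y_test / y_pred.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]

# With output_dict=True the report is a dict keyed by the class label as a
# string, plus "accuracy", "macro avg" and "weighted avg".
report = classification_report(y_true, y_pred, output_dict=True)

print(report["2"]["recall"])  # recall (sensitivity) of class 2
print(report["accuracy"])
```

Note that the report still contains only precision, recall, f1-score and support, so specificity has to be computed separately.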

Answer 3

Based on @StupidWolf's comment:

import numpy as np
import pandas as pd
from IPython.display import display  # Optional
from sklearn.metrics import (
    classification_report,
    precision_recall_fscore_support,
)

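# Assumption: y_true and y_pred are already defined; the class labels are taken from y_true.
classes = np.unique(y_true)

res = []
for class_p in classes: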
    prec, recall, fbeta_score, support = precision_recall_fscore_support(
        np.array(y_true) == class_p,
        np.array(y_pred) == class_p,
        pos_label=True,
        average=None,
    )
    res.append(
        [
            class_p,
            prec[1],
            recall[1],
            recall[0],
            fbeta_score[1],
            support[1],
        ]
    )

df_res  = pd.DataFrame(
    res,
    columns=[
        "class",
        "precision",
        "recall",
        "specificity",
        "f1-score",
        "support",
    ],
)

display(df_res)
