I have a 3-class problem, and I can report the precision and recall for each class with the code below:
from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))
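This gives me the precision and recall for each of the 3 classes in a nicely formatted table.
My question is: how can I now get the sensitivity and specificity for each of the 3 classes? I looked through sklearn.metrics and did not find anything that reports sensitivity and specificity.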
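If we check the help page for classification_report:
Note that in binary classification, recall of the positive class is also known as "sensitivity"; recall of the negative class is "specificity".
So we can convert the true labels and the predictions to binary for every class (that class versus the rest), and then use the recall results from precision_recall_fscore_support.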
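To make the one-vs-rest idea concrete before bringing in sklearn, here is a minimal pure-Python sketch. The helper name `sensitivity_specificity` is hypothetical (it is not part of sklearn), and the toy labels are assumed only for illustration. It binarizes the labels for one class and computes sensitivity as TP / (TP + FN) and specificity as TN / (TN + FP).

```python
def sensitivity_specificity(y_true, y_pred, label):
    """One-vs-rest sensitivity and specificity for a single class label.

    Hypothetical helper for illustration; not an sklearn function.
    """
    # Binarize: the chosen label is "positive", every other label is "negative".
    true_pos = [t == label for t in y_true]
    pred_pos = [p == label for p in y_pred]
    pairs = list(zip(true_pos, pred_pos))

    tp = sum(t and p for t, p in pairs)          # positive, predicted positive
    fn = sum(t and not p for t, p in pairs)      # positive, predicted negative
    tn = sum(not t and not p for t, p in pairs)  # negative, predicted negative
    fp = sum(not t and p for t, p in pairs)      # negative, predicted positive

    # Return NaN when a class has no positives (or no negatives) at all.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data, assumed for illustration.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
for label in sorted(set(y_true)):
    sens, spec = sensitivity_specificity(y_true, y_pred, label)
    print(f"class {label}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The sensitivity computed this way is the same quantity as the per-class recall, which is why the recall of the binarized problem can be reused.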
Using an example:
from sklearn.metrics import classification_report
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']
print(classification_report(y_true, y_pred, target_names=target_names))
This looks like:
              precision    recall  f1-score   support

     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3

    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5
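Using sklearn:
import numpy as np
import pandas as pd
from sklearn.metrics import precision_recall_fscore_support

res = []
for l in [0, 1, 2]:
    # One-vs-rest: class l is the positive (True) class, everything else is negative (False)
    prec, recall, _, _ = precision_recall_fscore_support(np.array(y_true) == l,
                                                         np.array(y_pred) == l,
                                                         pos_label=True, average=None)
    # Labels are sorted as [False, True], so recall[1] is the sensitivity
    # and recall[0] is the specificity
    res.append([l, recall[1], recall[0]])
Put the results into a dataframe:
pd.DataFrame(res, columns=['class', 'sensitivity', 'specificity'])
   class  sensitivity  specificity
0      0     1.000000         0.75
1      1     0.000000         0.75
2      2     0.666667         1.00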
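As an alternative to looping over the classes, the same one-vs-rest counts can be read from `sklearn.metrics.multilabel_confusion_matrix`, which returns one 2x2 matrix per class laid out as `[[tn, fp], [fn, tp]]`. The sketch below is one possible way to do this, not the method used above. The toy labels and the `labels` list are assumed for illustration, and it assumes every class has at least one positive and one negative sample, so no division by zero occurs.

```python
import numpy as np
from sklearn.metrics import multilabel_confusion_matrix

# Toy data, assumed for illustration.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
labels = [0, 1, 2]

# Shape (n_classes, 2, 2); each slice is [[tn, fp], [fn, tp]] for one class vs. the rest.
mcm = multilabel_confusion_matrix(y_true, y_pred, labels=labels)
tn, fp = mcm[:, 0, 0], mcm[:, 0, 1]
fn, tp = mcm[:, 1, 0], mcm[:, 1, 1]

sensitivity = tp / (tp + fn)  # recall of the class itself
specificity = tn / (tn + fp)  # recall of "everything else"

for label, sens, spec in zip(labels, sensitivity, specificity):
    print(f"class {label}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

This vectorized form avoids re-running the metric once per class, which may matter when there are many classes.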
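The output of classification_report is a formatted string. This code snippet extracts the required values and stores them in a 2D list. Note: to understand the code better, add print statements to check the values of the variables.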
from sklearn.metrics import classification_report

report = classification_report(y_test, y_pred)  # the report is returned as a single string
res = []  # list to store the cleaned rows
for line in report.split('\n'):  # process the report line by line
    values = line.split()  # split on whitespace; empty strings are dropped
    if values:  # skip blank lines
        res.append(values)
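If the goal is only to get the numbers out of the report, parsing the string may not be necessary: `classification_report` accepts `output_dict=True` and then returns a nested dictionary. The sketch below uses assumed toy labels for illustration. Note that the report still contains only precision, recall, f1-score and support, so specificity has to be computed separately.

```python
from sklearn.metrics import classification_report

# Toy data, assumed for illustration.
y_test = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]

# With output_dict=True the report is a dict instead of a formatted string.
report = classification_report(y_test, y_pred, output_dict=True)

# Per-class entries are keyed by the string form of the label: "0", "1", "2".
for label in ("0", "1", "2"):
    row = report[label]
    print(label, row["precision"], row["recall"], row["f1-score"], row["support"])

# Overall accuracy is stored as a plain float.
print("accuracy:", report["accuracy"])
```

Here the per-class `recall` entry is that class's sensitivity.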
Based on @StupidWolf's answer:
import numpy as np
import pandas as pd
from IPython.display import display  # Optional
from sklearn.metrics import (
    classification_report,
    precision_recall_fscore_support,
)

# Assumption: y_true and y_pred are already defined; the classes are taken from y_true
classes = np.unique(y_true)

res = []
for class_p in classes:
    # One-vs-rest: index 1 is the current class (True), index 0 is everything else (False)
    prec, recall, fbeta_score, support = precision_recall_fscore_support(
        np.array(y_true) == class_p,
        np.array(y_pred) == class_p,
        pos_label=True,
        average=None,
    )
    res.append(
        [
            class_p,
            prec[1],
            recall[1],  # recall of the class itself (sensitivity)
            recall[0],  # recall of the negative class (specificity)
            fbeta_score[1],
            support[1],
        ]
    )

df_res = pd.DataFrame(
    res,
    columns=[
        "class",
        "precision",
        "recall",
        "specificity",
        "f1-score",
        "support",
    ],
)
display(df_res)