MLmetrics F1_Score function

Question

I am trying to figure out how the F1_Score function in the MLmetrics library works when the y_pred values are not binary.

For example:

library(MLmetrics)
y <- c(1,1,1,1,1,0,0,0,0,0)
x <- c(1, 0.8, 0.654, 0.99, 0.75, 0.1, 0.3, 0.6, 0.05, 0.2)
x_preds <- ifelse(x < 0.5, 0, 1)
getF1 <- F1_Score(y_true=y, y_pred=x, positive="1")
getF2 <- F1_Score(y_true=y, y_pred=x_preds, positive="1")

print(getF1)
print(getF2)

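This gives getF1 = 0.3333333 and getF2 = 0.9090909.

The function example in the R documentation computes what I call getF2, where I specify exactly how the probability scores are assigned to either class label using a 0.5 threshold. What I don't understand is how the F1 score is computed when this threshold is not specified (getF1). Can anyone explain what the function does by default if you leave the probability scores as they are and do not convert them to binary before calling F1_Score? I cannot work out how it arrives at 0.3333333.

Thanks!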

Answer 1

Here is a simplified version of the function:

f1_score <- function(y_true, y_pred, positive = '1'){
  tt <- table(y_true, y_pred)
  TP <- tt[positive, positive]
  FP <- tt[rownames(tt)!=positive, positive]
  FN <- sum(tt[positive, colnames(tt) !=positive])
  precision <- TP/(TP+FP)
  recall <- TP/(TP+FN)
  2 * (precision * recall) / (precision + recall)
}

f1_score(y, x, "1")
[1] 0.3333333

f1_score(y, x_preds, "1")
[1] 0.9090909
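
To see where 0.3333333 comes from, it can help to look at the contingency table that this simplified function builds. The sketch below reuses y and x from the question; the names tt, tp, fp, fn and f1 are only illustrative.

```r
y <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)
x <- c(1, 0.8, 0.654, 0.99, 0.75, 0.1, 0.3, 0.6, 0.05, 0.2)

# Each distinct value in x becomes its own column, so only the
# column named "1" is treated as a positive prediction.
tt <- table(y, x)
print(colnames(tt))

tp <- tt["1", "1"]                       # y == 1 and x is exactly 1
fp <- tt["0", "1"]                       # y == 0 but x is exactly 1
fn <- sum(tt["1", colnames(tt) != "1"])  # y == 1 but x is any other value

precision <- tp / (tp + fp)
recall <- tp / (tp + fn)
f1 <- 2 * precision * recall / (precision + recall)
print(f1)
```

Only one of the five true positives has a score of exactly 1, so precision is 1/1 and recall is 1/5, which gives 2 * (1 * 0.2) / (1 + 0.2) = 0.3333333. No threshold is applied here; the raw scores are simply treated as class labels.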

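Note that FP is not summed, because we assume there are only two classes in y_true. If there are more classes, i.e. the labels are not binary, then sum it as well, in the same way as FN <- sum(....)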
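
As a minimal sketch of that multi-class case, the variant below sums FP over every non-positive row. The function name f1_score_multi and the three-class vectors y3 and p3 are hypothetical, and this is an illustration rather than MLmetrics' own implementation.

```r
# Hypothetical variant of the simplified function above:
# FP is summed so that it also works with more than two true classes.
f1_score_multi <- function(y_true, y_pred, positive = "1") {
  tt <- table(y_true, y_pred)
  TP <- tt[positive, positive]
  FP <- sum(tt[rownames(tt) != positive, positive])
  FN <- sum(tt[positive, colnames(tt) != positive])
  precision <- TP / (TP + FP)
  recall <- TP / (TP + FN)
  2 * (precision * recall) / (precision + recall)
}

# Made-up three-class labels (0, 1, 2) for illustration only.
y3 <- c(1, 1, 1, 0, 0, 2, 2, 2)
p3 <- c(1, 1, 0, 1, 0, 1, 2, 2)
res <- f1_score_multi(y3, p3, "1")
print(res)
```

With these made-up labels there are 2 true positives, 2 false positives (one from class 0, one from class 2) and 1 false negative, so the result equals 4/7.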
