mIoU for multi-class

Question

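I would like to understand how mIoU is calculated for multi-class classification. The formula for each class is

IoU = TP / (TP + FP + FN)

and the mIoU is then the average over the classes. However, I don't understand what happens to classes that are not represented. For those classes the formula becomes a division by 0, so I ignore them and compute the mean only over the represented classes.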

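The problem is that when the prediction is wrong, the score drops sharply, because each wrongly predicted class adds another class to average over. For example: in semantic segmentation, the ground truth of an image consists of 4 classes (0, 1, 2, 3), and 6 classes are represented over the dataset. The prediction also consists of 4 classes (0, 1, 4, 5), but all items labelled 2 and 3 (in the ground truth) are classified as 4 and 5 (in the prediction). In this case, should we compute the mIoU over 6 classes, even though 4 classes are completely wrong and each has an IoU of 0? So the issue is that if even a single pixel is predicted in a class that is not in ground_truth, we have to divide by a higher denominator, which lowers the score a lot.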
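To make the scenario concrete, here is a minimal sketch with a made-up 8-pixel image (the pixel values and the helper name per_class_iou are assumptions for illustration, not taken from any library). It computes the per-class IoU from TP, FP and FN, marks classes absent from both arrays as NaN, and compares averaging over every class that appears with averaging over the ground-truth classes only.

```python
import numpy as np

def per_class_iou(gt, pred, num_classes):
    """Return IoU per class; NaN where a class is absent from both gt and pred."""
    gt = np.asarray(gt).ravel()
    pred = np.asarray(pred).ravel()
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        tp = np.sum((gt == c) & (pred == c))
        fp = np.sum((gt != c) & (pred == c))
        fn = np.sum((gt == c) & (pred != c))
        denom = tp + fp + fn
        if denom > 0:  # avoid the 0/0 case for unrepresented classes
            ious[c] = tp / denom
    return ious

# Hypothetical image: classes 2 and 3 are predicted as 4 and 5.
gt = [0, 0, 1, 1, 2, 2, 3, 3]
pred = [0, 0, 1, 1, 4, 4, 5, 5]

ious = per_class_iou(gt, pred, num_classes=6)
# Mean over every class that appears in gt or pred (6 classes here).
miou_all_present = np.nanmean(ious)
# Mean over the 4 ground-truth classes only.
miou_gt_only = np.mean(ious[np.unique(gt)])
```

With these numbers, classes 0 and 1 have IoU 1 and classes 2 to 5 have IoU 0, so the first average is 1/3 and the second is 0.5, which is exactly the gap the question is about.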

Is this the correct way to compute mIoU for multi-class (and semantic segmentation)?

Answer 1

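Instead of computing the mIoU of each image and then the "mean" mIoU over all images, I compute the mIoU as if the whole dataset were one big image. If a class is not in the images and is not predicted either, I set the corresponding IoU to 1.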

From scratch:

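import numpy as np

def miou(gt, pred, nbr_mask):
    # gt, pred: integer label arrays of shape (n_images, height, width)
    intersection = np.zeros(nbr_mask)  # per class: |A and B|
    den = np.zeros(nbr_mask)  # per class: |A| + |B| = |A or B| + |A and B|
    for i in range(len(gt)):
        height, width = len(gt[i]), len(gt[i][0])
        for j in range(height):
            for k in range(width):
                if pred[i][j][k] == gt[i][j][k]:
                    intersection[gt[i][j][k]] += 1
                den[pred[i][j][k]] += 1
                den[gt[i][j][k]] += 1
    mIoU = 0
    for i in range(nbr_mask):
        if den[i] != 0:
            # union = |A| + |B| - |A and B|
            mIoU += intersection[i] / (den[i] - intersection[i])
        else:
            # class absent from both gt and pred: count its IoU as 1
            mIoU += 1
    mIoU = mIoU / nbr_mask
    return mIoU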

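with gt the array of ground-truth labels and pred the predictions for the associated images (the two arrays must correspond element by element and have the same size).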
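The triple loop above gets slow on real images. As a rough sketch of the same idea, the counts can be obtained with np.bincount; this assumes gt and pred hold non-negative integer labels smaller than nbr_mask, and the name miou_vectorized is made up for this example.

```python
import numpy as np

def miou_vectorized(gt, pred, nbr_mask):
    # Flatten everything so the dataset is treated as one big image.
    gt = np.asarray(gt).ravel()
    pred = np.asarray(pred).ravel()
    # Pixels where prediction and ground truth agree, counted per class.
    intersection = np.bincount(gt[gt == pred], minlength=nbr_mask)
    # |A| + |B| per class.
    den = np.bincount(gt, minlength=nbr_mask) + np.bincount(pred, minlength=nbr_mask)
    union = den - intersection
    # Classes absent from both arrays keep IoU = 1, as in the loop version.
    iou = np.ones(nbr_mask)
    present = union > 0
    iou[present] = intersection[present] / union[present]
    return iou.mean()

# Tiny usage example: one 2x2 image, 3 classes, class 2 never appears.
gt_demo = [[[0, 0], [1, 1]]]
pred_demo = [[[0, 1], [1, 1]]]
score = miou_vectorized(gt_demo, pred_demo, 3)
```

Here class 0 has IoU 1/2, class 1 has IoU 2/3 and the absent class 2 counts as 1, so the result should match what the loop version returns for the same input.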

Answer 2

What is really wrong in the other answer is this:

classwise_IOU = i/u # tensor of size (num_classes)
mIOU = i.sum()/u.sum() # mean IOU, taking (i/u).mean() is wrong

(i/u).mean() is correct! Even in the code linked by pu239 you can find

iou_class = intersection_meter.sum / (union_meter.sum + 1e-10)
mIoU = np.mean(iou_class)
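To see why the two expressions are not interchangeable, here is a small sketch with made-up per-class pixel counts (the numbers are purely illustrative). Averaging the per-class ratios gives every class the same weight, while dividing the summed counts lets the largest class dominate.

```python
import numpy as np

# Hypothetical per-class intersection and union pixel counts for 3 classes.
i = np.array([900.0, 50.0, 5.0])
u = np.array([1000.0, 100.0, 50.0])

per_class = i / u                   # IoU of each class: 0.9, 0.5, 0.1
mean_of_ratios = per_class.mean()   # every class weighs the same
ratio_of_sums = i.sum() / u.sum()   # large classes dominate
```

With these counts the mean of the ratios is 0.5, while the ratio of the sums is 955/1150, about 0.83, because the big first class hides the poor result on the small third class.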

Answer 3

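In addition to the previous answers, here is a fast and efficient PyTorch GPU implementation for computing the mIoU and the class-wise IoU for a batch of size (N, H, W) (for both the pred mask and the labels), taken from the NeurIPS 2021 paper "Few-Shot Segmentation via Cycle-Consistent Transformer", whose code is available in its GitHub repository.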

import torch
import torch.nn.functional as F

def intersectionAndUnionGPU(output, target, K, ignore_index=255):
    # 'K' classes, output and target sizes are N or N * L or N * H * W, each value in range 0 to K - 1.
    assert (output.dim() in [1, 2, 3])
    assert output.shape == target.shape
    output = output.view(-1)
    target = target.view(-1)
    output[target == ignore_index] = ignore_index
    intersection = output[output == target]
    # Cast to float so that torch.histc also works on CPU tensors.
    area_intersection = torch.histc(intersection.float(), bins=K, min=0, max=K-1)
    area_output = torch.histc(output.float(), bins=K, min=0, max=K-1)
    area_target = torch.histc(target.float(), bins=K, min=0, max=K-1)
    area_union = area_output + area_target - area_intersection
    return area_intersection, area_union, area_target

Usage example:

output = torch.rand(4, 5, 224, 224) # model output; batch size=4; channels=5, H,W=224
preds = F.softmax(output, dim=1).argmax(dim=1) # (4, 224, 224)
labels = torch.randint(0,5, (4, 224, 224))

i, u, _ = intersectionAndUnionGPU(preds, labels, 5) # 5 is num_classes

classwise_IOU = i/u # tensor of size (num_classes)
mIOU = (i/u).mean() # mean IoU, averaged over the classes

Hope this helps everyone!

(A non-GPU implementation is also available in the repository!)

Edit (1/11/23): following Jonas's answer, changed i.sum()/u.sum() in the example usage to (i/u).mean() (the function is unchanged). I did some digging, and although I was never able to find a concrete definition of mIoU from a credible source, it seems that the huggingface implementation, which is based on mmsegmentation's implementation (huggingface PR), already computes it this way. Note that mmsegmentation's implementation does not return the mean IoU even if you want it; instead, it returns an array/tensor of IoUs (mmsegmentation). Only the huggingface PR takes the mean of i/u on top of that. I now understand why it should be done this way, though.
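For a whole dataset, the meter-style snippet quoted in the other answer sums the per-class intersections and unions over all batches before dividing. Below is a minimal NumPy sketch of that accumulation; the helper names intersection_and_union and dataset_miou are hypothetical, and labels are assumed to be non-negative integers smaller than num_classes. Note that with the small eps in the denominator, a class that never appears gets IoU 0 rather than 1.

```python
import numpy as np

def intersection_and_union(pred, target, num_classes):
    # Per-class intersection and union pixel counts for one batch.
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    inter = np.bincount(pred[pred == target], minlength=num_classes)
    area_pred = np.bincount(pred, minlength=num_classes)
    area_target = np.bincount(target, minlength=num_classes)
    return inter, area_pred + area_target - inter

def dataset_miou(batches, num_classes, eps=1e-10):
    # Accumulate the counts over all batches, then divide once per class.
    total_i = np.zeros(num_classes)
    total_u = np.zeros(num_classes)
    for pred, target in batches:
        i, u = intersection_and_union(pred, target, num_classes)
        total_i += i
        total_u += u
    iou_class = total_i / (total_u + eps)
    return iou_class, iou_class.mean()

# Two tiny hypothetical batches given as (pred, target) pairs, 2 classes.
batches = [([0, 0, 1, 1], [0, 1, 1, 1]), ([1, 1, 0, 0], [1, 1, 0, 1])]
iou_class, miou_value = dataset_miou(batches, num_classes=2)
```

In this toy example the accumulated counts give an IoU of 2/4 for class 0 and 4/6 for class 1, so the mean is about 0.583.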
