How to count the total number of pixels per class in Detectron2 semantic segmentation

Question

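I want to count the total number of pixels for each segmented class. I only need one count per general object category, e.g. one class for all vehicles, one class for all people, and so on. That is why I am using semantic segmentation rather than instance segmentation (which treats each vehicle or person instance separately). But the semantic segmentation output in detectron2 does not contain binary masks.

I know that the output of instance segmentation is a set of binary masks, and the pixel counts can be obtained with the following code: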

masks = output['instances'].pred_masks 
results = torch.sum(torch.flatten(masks, start_dim=1),dim=1)

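This gives the pixel counts, but per vehicle instance, which is not what I want. The semantic segmentation output, however, is the field "sem_seg", which contains the predicted class probabilities for each general class rather than binary masks. How do I get the pixel count of each class from semantic segmentation?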

Answer 1

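Although the question was asked 7 months ago, I am answering anyway in case you or someone else still needs it.

As Christoph Rackwit mentioned, the idea is to sum over the instances. I use that approach to compute the total pixel count per class, together with the code you mentioned for finding the total pixels of each instance, which comes from instance segmentation.

First we extract the predicted classes (pred_classes), which are essentially index values, and the predicted masks (pred_masks); the two tensors are in the same order.
Then we iterate over pred_classes and pred_masks and use a dictionary to add each instance's pixel sum to the running total of its class.
Then we store the class names used for training in a list.
Finally, we iterate over the dictionary and use the class list with the pred_classes index values to report the total number of pixels of each class (covering multiple instances).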

import locale

import torch
from detectron2.data import MetadataCatalog

# Class names in the same order used for training; replace with your custom dataset's class names.
# (self.cfg is the config the predictor was built from.)
pre_classes = MetadataCatalog.get(self.cfg.DATASETS.TRAIN[0]).thing_classes
masks = predictions["instances"].pred_masks  # one binary mask per predicted instance
# Class index of each instance (indexes into pre_classes), converted to a plain list
classes = predictions["instances"].pred_classes.cpu().numpy().tolist()

# Total pixels of each instance; .cpu() is needed when inference ran on a GPU
results = torch.sum(torch.flatten(masks, start_dim=1), dim=1).cpu().numpy().tolist()

count = dict()  # maps class index -> total pixels of that class
for i in range(len(classes)):  # iterate over the predicted instances
    # Add this instance's pixel sum to the running total of its class (starting from 0 for a new class)
    count[classes[i]] = count.get(classes[i], 0) + results[i]

locale.setlocale(locale.LC_ALL, 'en_IN.UTF-8')  # Indian number format; the locale must be installed on the system
for k, v in count.items():  # iterate over the dict
    # pre_classes[k] maps the class index to its name; locale.format_string applies the digit grouping
    print(f"{pre_classes[k]} class contains {locale.format_string('%d', v, grouping=True)} pixels")

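I ran instance segmentation on the image below using a pre-trained model

which produced the following image

and it also produced the result you wanted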

dog class contains 1,39,454 pixels
cat class contains 95,975 pixels
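
The per-class aggregation can also be done without a Python loop. The following is only a sketch on toy tensors, assuming `pred_masks` is a boolean tensor of shape (N, H, W) and `pred_classes` holds N class indices; the function name `pixels_per_class` and the toy data are made up for illustration and do not come from Detectron2.

```python
import torch

def pixels_per_class(pred_masks, pred_classes, num_classes):
    """Sum instance mask areas into one total per class index."""
    # Pixels per instance: flatten each (H, W) mask and sum -> shape (N,)
    per_instance = pred_masks.flatten(start_dim=1).sum(dim=1)
    # One accumulator slot per class, on the same device as the masks
    totals = torch.zeros(num_classes, dtype=per_instance.dtype,
                         device=per_instance.device)
    # Add each instance's pixel count into the slot of its class
    totals.index_add_(0, pred_classes.long(), per_instance)
    return totals

# Toy data (hypothetical): three instances on a 2x2 image, classes 0, 1, 0
toy_masks = torch.tensor([
    [[True, True], [True, False]],    # instance 0: 3 pixels, class 0
    [[False, True], [False, True]],   # instance 1: 2 pixels, class 1
    [[False, False], [False, True]],  # instance 2: 1 pixel,  class 0
])
toy_classes = torch.tensor([0, 1, 0])
toy_totals = pixels_per_class(toy_masks, toy_classes, num_classes=2)
print(toy_totals.tolist())
```

Unlike the dictionary version, this returns a count for every class index, including classes with no detected instance (which stay at 0).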

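Since I have no practical experience with semantic segmentation, I have provided a solution using instance segmentation. If you insist on achieving the same goal with semantic segmentation, please provide the weights, inference method, and test dataset so that I can help you.

Anyway, I hope this is what you were looking for. Feel free to reach out with any questions about the code, the logic, or how it works.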
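
For reference, one way the semantic case might be handled is sketched below. It assumes that the "sem_seg" field is a tensor of shape (num_classes, H, W) holding one score per class and pixel, as the question describes. The function name `count_pixels_per_class` and the toy tensor are made up for illustration, and the sketch has not been run against a real model.

```python
import torch

def count_pixels_per_class(sem_seg):
    """Return {class_index: pixel_count} for a (num_classes, H, W) score tensor."""
    num_classes = sem_seg.shape[0]
    # Assign each pixel to its highest-scoring class -> (H, W) label map
    labels = sem_seg.argmax(dim=0)
    # Count pixels per label; minlength keeps absent classes in the result as 0
    counts = torch.bincount(labels.flatten(), minlength=num_classes)
    return {i: int(c) for i, c in enumerate(counts.tolist())}

# Toy stand-in (hypothetical) for outputs["sem_seg"]: 3 classes on a 2x3 image
toy_sem_seg = torch.tensor([
    [[0.90, 0.1, 0.2], [0.8, 0.1, 0.1]],  # scores for class 0
    [[0.05, 0.8, 0.7], [0.1, 0.2, 0.3]],  # scores for class 1
    [[0.05, 0.1, 0.1], [0.1, 0.7, 0.6]],  # scores for class 2
])
toy_counts = count_pixels_per_class(toy_sem_seg)
print(toy_counts)
```

With a real model this would be called on the "sem_seg" entry of the prediction dict. The class names for the indices would then presumably come from the dataset metadata's `stuff_classes` rather than `thing_classes`, which is worth verifying for your dataset.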
