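I want to do some initial training on bounding boxes from a larger public dataset. FiftyOne looks like a good starting point, but I'm having trouble getting it to work with PyTorch. I assume I'm missing something small here. I tried following the official GitHub examples, and I'm starting to suspect they are outdated, because it fails even when I use the example class as-is. The problem is the access method: the only quasi-numeric options are dataset.first() and dataset.last().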
Problem: the data loader throws an exception
KeyError: 'Accessing samples by numeric index is not supported. Use sample IDs, filepaths, slices, boolean arrays, or a boolean ViewExpression instead'
import torch
from PIL import Image
import fiftyone.utils.coco as fouc


class FiftyOneDS(torch.utils.data.Dataset):
    def __init__(
        self,
        fiftyone_ds,
        transforms=None,
        gt_field="ground_truth",
        classes=None,
    ):
        self.samples = fiftyone_ds
        self.transforms = transforms
        self.gt_field = gt_field
        self.classes = classes  # don't care
        self.img_paths = self.samples.values("filepath")

    def __getitem__(self, idx):
        img_path = self.img_paths[idx]
        # Integer key: FiftyOne rejects numeric indexes, so this lookup raises the KeyError
        sample = self.samples[idx]
        metadata = sample.metadata
        img = Image.open(img_path).convert("RGB")

        boxes = []
        labels = []
        detections = sample[self.gt_field].detections
        for det in detections:
            # Keep only "car" detections
            if det["label"] != "car":
                continue
            # Note: labels_map_rev is not defined in __init__ as posted
            category_id = self.labels_map_rev[det.label]
            coco_obj = fouc.COCOObject.from_label(
                det, metadata, category_id=category_id,
            )
            # Convert COCO [x, y, w, h] to [x_min, y_min, x_max, y_max]
            x, y, w, h = coco_obj.bbox
            boxes.append([x, y, x + w, y + h])
            labels.append(coco_obj.category_id)

        target = {}
        target["boxes"] = torch.as_tensor(boxes, dtype=torch.float32)
        target["labels"] = torch.as_tensor(labels, dtype=torch.int64)
        target["image_id"] = torch.as_tensor([idx])

        if self.transforms is not None:
            img, target = self.transforms(img, target)

        return img, target

    def __len__(self):
        return len(self.img_paths)
Then I use the dataset via this snippet (this part seems to work fine):
carset = FiftyOneDS(dataset)
print("type:", type(torch_dataset_test))
# type: <class 'fiftyone.core.view.DatasetView'>
print("first elem:", torch_dataset_test[0])
# KeyError: 'Accessing samples by numeric index is not supported. Use sample IDs, filepaths, slices, boolean arrays, or a boolean ViewExpression instead'
How can I rewrite my dataset class so that it works with a PyTorch data loader?
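FiftyOne does not support accessing samples by numeric index, as the error says. Instead, you can look the sample up by its filepath:
sample = self.samples[img_path]
That should solve the problem. Read more here: https://docs.voxel51.com/user_guide/using_views.html#slicing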
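
To make the idea concrete without needing FiftyOne installed, here is a minimal sketch of the lookup pattern. FakeSampleCollection and PathIndexedDataset are hypothetical stand-ins, not FiftyOne classes: the first only imitates the rule quoted in the error message (filepath keys work, integer keys raise KeyError), and the second shows how a map-style dataset can turn the integer idx that a data loader passes in into a filepath before touching the collection.

```python
class FakeSampleCollection:
    """Hypothetical stand-in for a FiftyOne dataset or view.

    It only imitates the rule from the error message: string keys
    (filepaths) are accepted, integer indexes raise KeyError.
    """

    def __init__(self, samples_by_path):
        self._samples = dict(samples_by_path)

    def values(self, field):
        # Only the "filepath" field is modeled in this sketch.
        if field != "filepath":
            raise ValueError("only 'filepath' is modeled in this sketch")
        return list(self._samples)

    def __getitem__(self, key):
        if isinstance(key, int):
            raise KeyError("Accessing samples by numeric index is not supported")
        return self._samples[key]


class PathIndexedDataset:
    """Map-style dataset: integer idx -> filepath -> sample."""

    def __init__(self, collection):
        self.samples = collection
        # Cache the filepaths once so integer positions are stable.
        self.img_paths = self.samples.values("filepath")

    def __getitem__(self, idx):
        img_path = self.img_paths[idx]  # the integer is used on a plain list
        return self.samples[img_path]   # the collection is keyed by filepath

    def __len__(self):
        return len(self.img_paths)


collection = FakeSampleCollection(
    {"/data/a.jpg": {"label": "car"}, "/data/b.jpg": {"label": "bus"}}
)
dataset = PathIndexedDataset(collection)
print(len(dataset), dataset[0])
```

In the class from the question, the same change is the single line shown above: the integer stays on self.img_paths, and self.samples is only ever indexed with the resulting filepath.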