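I'm working with the EMNIST dataset and want to load it through PyTorch, but it fails with a strange error:
RuntimeError: File not found or corrupted.
Here is how I'm trying to load the dataset: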
trainset = torchvision.datasets.EMNIST(root="emnist",
                                       split="letters",
                                       train=True,
                                       download=True,
                                       transform=transforms.ToTensor())
What could be going wrong?
I think the download link is incorrect. Try downloading the dataset with this script instead:
https://github.com/Tony-Y/pytorch_warmup/blob/master/examples/emnist/download.py
Then change your code as follows:
import torchvision
from torchvision import transforms

# Update the path to where you've manually placed the EMNIST dataset
root_dir = "./path/to/your/emnist"  # Change this to the actual path

trainset = torchvision.datasets.EMNIST(root=root_dir,
                                       split="letters",
                                       train=True,
                                       download=False,  # Set to False since you already downloaded it
                                       transform=transforms.ToTensor())
It works on Google Colab (you can try it at https://colab.research.google.com/):
import torchvision
import torchvision.transforms as transforms

trainset = torchvision.datasets.EMNIST(root="emnist",
                                       split="mnist",  # or "letters", "digits"
                                       train=True,
                                       download=True,
                                       transform=transforms.ToTensor())
Output:
Downloading https://biometrics.nist.gov/cs_links/EMNIST/gzip.zip to emnist/EMNIST/raw/gzip.zip
100%|██████████| 562M/562M [00:10<00:00, 52.9MB/s]
Extracting emnist/EMNIST/raw/gzip.zip to emnist/EMNIST/raw
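
If the same code fails on your machine, one thing worth ruling out is a missing or partially downloaded archive left over from an earlier attempt. Below is a minimal sketch that looks for gzip.zip at the location shown in the log above (root/EMNIST/raw/gzip.zip) and checks whether it is a readable zip file. The function name check_emnist_archive is my own, not part of torchvision, and this only tests that the zip can be read. As far as I know torchvision compares a checksum instead, so an "ok" here does not guarantee torchvision will accept the file.

```python
import zipfile
from pathlib import Path


def check_emnist_archive(root="emnist"):
    """Return "missing", "corrupted", or "ok" for root/EMNIST/raw/gzip.zip.

    Hypothetical helper: the path layout is taken from the download log,
    and the check is zip readability only, not torchvision's own check.
    """
    archive = Path(root) / "EMNIST" / "raw" / "gzip.zip"
    if not archive.is_file():
        return "missing"
    try:
        with zipfile.ZipFile(archive) as zf:
            # testzip() returns the name of the first bad member, or None
            bad_member = zf.testzip()
    except zipfile.BadZipFile:
        # The file is not a valid zip at all (e.g. a truncated download)
        return "corrupted"
    return "ok" if bad_member is None else "corrupted"


if __name__ == "__main__":
    print(check_emnist_archive("emnist"))
```

If this reports "corrupted", deleting the file and rerunning with download=True should trigger a fresh download.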