I am getting this error in my loss function. Here is an example:
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.Tensor([[-10.3353, -28.4371, 2.0768, -4.2789, -8.6644, -6.0815],
[-10.3353, -28.4371, 2.0768, -4.2789, -8.6644, -6.0815],
[-10.3353, -28.4371, 2.0768, -4.2789, -8.6644, -6.0815],
[-10.3353, -28.4371, 2.0768, -4.2789, -8.6644, -6.0815]]).to(device)
b = torch.Tensor([ -100, -1, -100, 2456]).long().to(device)
loss = torch.nn.functional.cross_entropy(a,b,ignore_index=-1)
print(loss)
The error is:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-9-29fe2d43a573> in <cell line: 1>()
----> 1 loss = torch.nn.functional.cross_entropy(a,b,ignore_index=-1).to(device)
2 loss
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
3027 if size_average is not None or reduce is not None:
3028 reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 3029 return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
3030
3031
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
I tried setting
export TORCH_USE_CUDA_DSA=1
but I still get the same error.
b must not contain negative values, and every value in b must be strictly less than a.shape[-1] (the number of classes):

assert ((0 <= b) & (b < a.shape[-1])).all()

Your b does not satisfy these conditions. (Entries equal to the ignore_index are the one exception: with ignore_index=-1 the -1 entry is skipped, but -100 and 2456 are still out of range.)
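One way to see what is wrong without fighting the asynchronous CUDA assert (my own sketch, not part of the original setup) is to run the same loss on the CPU, where an out-of-range target fails immediately with a readable message, and to check the targets explicitly before moving them to the GPU:

import torch
import torch.nn.functional as F

a = torch.randn(4, 6)                                        # 4 samples, 6 classes
b = torch.tensor([-100, -1, -100, 2456], dtype=torch.long)   # the problematic targets

# On the CPU the invalid target raises a plain Python exception instead of an
# asynchronous device-side assert (the exact exception type can vary by version).
try:
    F.cross_entropy(a, b, ignore_index=-1)
except (IndexError, RuntimeError) as e:
    print(e)

# Explicit check: entries equal to ignore_index are allowed; everything else
# must be a valid class index in [0, a.shape[-1]).
valid = (b == -1) | ((0 <= b) & (b < a.shape[-1]))
print(valid)   # tensor([False,  True, False, False])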
From the documentation of F.cross_entropy:

Target: If containing class indices, shape (), (N) or (N, d1, d2, ..., dK) with K ≥ 1 in the case of K-dimensional loss, where each value should be in the range [0, C). If containing class probabilities, same shape as the input, and each value should be in [0, 1].

For example:
import torch
import torch.nn.functional as F

a = torch.randn(4, 6)                          # logits for 4 samples over 6 classes
b = torch.randint(6, (4,), dtype=torch.int64)  # class indices in [0, 6)
loss = F.cross_entropy(a, b)
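For completeness, here is one way the original snippet can be made to run. This is just a sketch under the assumption that the targets are supposed to be valid class indices in [0, 6), with -1 reserved as the ignore_index; the target values below are made up for illustration:

import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.tensor([[-10.3353, -28.4371, 2.0768, -4.2789, -8.6644, -6.0815]] * 4,
                 device=device)                    # the same 6-class logits, 4 rows
b = torch.tensor([2, -1, 0, 5], device=device)     # valid indices, one ignored entry

loss = F.cross_entropy(a, b, ignore_index=-1)
print(loss)    # a finite scalar instead of a device-side assert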
Changing the torch version from 2.0.0+cu118 to 2.1.0.dev20230501+cu117:
pip3 install numpy --pre torch torchvision torchaudio --force-reinstall --index-url https://download.pytorch.org/whl/nightly/cu117
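After the reinstall it is worth confirming which build Python actually picks up (a quick sanity check, not part of the original command):

import torch
print(torch.__version__)          # e.g. 2.1.0.dev20230501+cu117
print(torch.version.cuda)         # CUDA toolkit the wheel was built against
print(torch.cuda.is_available())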