F.cross_entropy raises "RuntimeError: CUDA error: device-side assert triggered. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions."

Question

I am getting this error in my loss function. Here is an example:

import torch
device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.Tensor([[-10.3353, -28.4371,   2.0768,   -4.2789,  -8.6644,  -6.0815],
        [-10.3353, -28.4371,   2.0768,   -4.2789,  -8.6644,  -6.0815],
        [-10.3353, -28.4371,   2.0768,   -4.2789,  -8.6644,  -6.0815],
        [-10.3353, -28.4371,   2.0768,   -4.2789,  -8.6644,  -6.0815]]).to(device)
b = torch.Tensor([ -100,  -1,  -100,  2456]).long().to(device)

loss = torch.nn.functional.cross_entropy(a,b,ignore_index=-1)
print(loss)

The error is:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-9-29fe2d43a573> in <cell line: 1>()
----> 1 loss = torch.nn.functional.cross_entropy(a,b,ignore_index=-1).to(device)
      2 loss

/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   3027     if size_average is not None or reduce is not None:
   3028         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 3029     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
   3030 
   3031 

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

I tried setting

export TORCH_USE_CUDA_DSA=1

but I still get the same error.

Answer 1

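The target b must not contain negative values, and its maximum value must be strictly less than a.shape[-1]. In other words, this should hold:

assert ((0 <= b) & (b < a.shape[-1])).all()

Your b does not satisfy these conditions: it contains -100 and 2456, while a has only 6 classes. The one exception to the range rule is a value equal to ignore_index (-1 in your call), which is skipped.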
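A pre-flight check along these lines can list the offending labels before the loss is ever computed. This is only a sketch: the helper name find_invalid_targets is made up for illustration, and it assumes (N, C) logits with class-index targets, as in the question.

```python
import torch

def find_invalid_targets(logits, target, ignore_index=-100):
    """Return target values that are neither valid class indices nor ignore_index.

    Hypothetical helper; assumes logits of shape (N, C) and integer targets of shape (N,).
    """
    num_classes = logits.shape[-1]
    in_range = (target >= 0) & (target < num_classes)
    ignored = target == ignore_index
    return target[~(in_range | ignored)]

a = torch.randn(4, 6)                      # 4 samples, 6 classes
b = torch.tensor([-100, -1, -100, 2456])   # targets from the question

bad = find_invalid_targets(a, b, ignore_index=-1)
print(bad.tolist())  # -1 is skipped because it equals ignore_index
```

With the question's values, the check reports -100 (twice) and 2456; passing ignore_index=-100 instead would flag -1 and 2456.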

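Target: If containing class indices, shape (), (N) or (N, d1, d2, ..., dK) with K≥1 in the case of K-dimensional loss, where each value should be between [0, C). If containing class probabilities, same shape as the input and each value should be between [0, 1].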

Source: F.cross_entropy
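One way to find out which label breaks the range requirement is to rerun the failing call on the CPU, where the check is reported synchronously as an ordinary Python exception rather than an asynchronous device-side assert. The sketch below reuses the question's targets with random logits; the exact exception type and message may differ between PyTorch versions, so it catches both IndexError and RuntimeError. (As the error text itself suggests, TORCH_USE_CUDA_DSA appears to be a compile-time option, which would explain why exporting it at runtime changed nothing.)

```python
import torch
import torch.nn.functional as F

# Deliberately keep everything on the CPU for debugging.
a = torch.randn(4, 6)
b = torch.tensor([-100, -1, -100, 2456])

caught = None
try:
    F.cross_entropy(a, b, ignore_index=-1)
except (IndexError, RuntimeError) as err:
    # The CPU error message typically names the out-of-range target value.
    caught = err
    print(type(err).__name__, err)
```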

This example is adapted from the F.cross_entropy documentation:

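import torch
import torch.nn.functional as F

a = torch.randn(4, 6)                           # logits: 4 samples, 6 classes
b = torch.randint(6, (4,), dtype=torch.int64)   # class indices in [0, 6)
loss = F.cross_entropy(a, b)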
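If out-of-range labels are padding or sentinel values that are meant to be skipped, one possible fix is to map them all onto ignore_index before computing the loss. The following is a sketch under that assumption; the label values are hypothetical, and masking silently can hide genuine data bugs, so it is worth confirming that the invalid labels are expected first.

```python
import torch
import torch.nn.functional as F

IGNORE = -1  # same ignore_index as in the question

a = torch.randn(4, 6)
b = torch.tensor([2, -1, -100, 2456])  # hypothetical labels: one valid, three to skip

# Keep valid class indices, replace everything else with IGNORE.
valid = (b >= 0) & (b < a.shape[-1])
b_clean = torch.where(valid, b, torch.full_like(b, IGNORE))

loss = F.cross_entropy(a, b_clean, ignore_index=IGNORE)
print(b_clean.tolist(), loss.item())
```

At least one target must remain valid; if every entry is ignored, the default mean reduction has nothing to average over.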

Answer 2

Change the torch version from 2.0.0+cu118 to 2.1.0.dev20230501+cu117:

pip3 install numpy --pre torch torchvision torchaudio --force-reinstall --index-url https://download.pytorch.org/whl/nightly/cu117
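After a reinstall like this, a quick check can confirm which build is actually being imported. This sketch only reports the active version; it does not show whether the version change by itself resolves the error.

```python
import torch

print(torch.__version__)          # version string of the installed build
print(torch.version.cuda)         # CUDA version the wheel was built against, or None for CPU-only builds
print(torch.cuda.is_available())  # whether a usable GPU is visible to this build
```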
