F.cross_entropy raises "RuntimeError: CUDA error: device-side assert triggered. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions."

Question (votes: 0, answers: 2)

I am getting this error from my loss function. Here is a minimal example:

import torch
device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.Tensor([[-10.3353, -28.4371,   2.0768,   -4.2789,  -8.6644,  -6.0815],
        [-10.3353, -28.4371,   2.0768,   -4.2789,  -8.6644,  -6.0815],
        [-10.3353, -28.4371,   2.0768,   -4.2789,  -8.6644,  -6.0815],
        [-10.3353, -28.4371,   2.0768,   -4.2789,  -8.6644,  -6.0815]]).to(device)
b = torch.Tensor([ -100,  -1,  -100,  2456]).long().to(device)

loss = torch.nn.functional.cross_entropy(a,b,ignore_index=-1)
print(loss)

The error is:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-9-29fe2d43a573> in <cell line: 1>()
----> 1 loss = torch.nn.functional.cross_entropy(a,b,ignore_index=-1).to(device)
      2 loss

/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   3027     if size_average is not None or reduce is not None:
   3028         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 3029     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
   3030 
   3031 

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

I tried setting

export TORCH_USE_CUDA_DSA=1

but I still get the same error.

python pytorch loss-function
2 Answers

0 votes

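b must not contain negative values (apart from entries equal to ignore_index, which are skipped). In addition, every value in b must be strictly less than a.shape[-1], so the largest valid value is a.shape[-1] - 1: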

assert ((0 <= b) & (b < a.shape[-1])).all()

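Your b does not satisfy these conditions. With ignore_index=-1, only the -1 entry is skipped, while -100 and 2456 fall outside [0, 6). Note that -100 is only skipped when ignore_index is left at its default value of -100.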
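To see which labels trip the assertion before the tensors ever reach the GPU, a check along the following lines may help. It is a minimal sketch in plain Python: find_invalid_targets is a hypothetical helper (not part of PyTorch), and it assumes the targets are class indices passed as a list of ints, for example b.tolist().

```python
def find_invalid_targets(targets, num_classes, ignore_index=-100):
    """Return (position, value) pairs for labels that are neither
    equal to ignore_index nor inside [0, num_classes).

    Hypothetical helper for illustration; not a PyTorch API.
    """
    return [
        (i, t)
        for i, t in enumerate(targets)
        if t != ignore_index and not (0 <= t < num_classes)
    ]


# Values from the question: 6 classes, ignore_index=-1.
bad = find_invalid_targets([-100, -1, -100, 2456], num_classes=6, ignore_index=-1)
print(bad)
```

With tensors, the call would look like find_invalid_targets(b.tolist(), a.shape[-1], ignore_index=-1). An empty result means every label is either ignored or a valid class index.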

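Target: If containing class indices, shape (), (N) or (N, d1, d2, ..., dK) with K ≥ 1 in the case of K-dimensional loss, where each value should be between [0, C). If containing class probabilities, same shape as the input and each value should be between [0, 1].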

Source: F.cross_entropy documentation


This example is adapted from the F.cross_entropy documentation:

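import torch
import torch.nn.functional as F

a = torch.randn(4, 6)  # logits: 4 samples, 6 classes
b = torch.randint(6, (4,), dtype=torch.int64)  # class indices in [0, 6)
loss = F.cross_entropy(a, b)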
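To make the range requirement concrete, here is a rough plain-Python sketch of what the loss computes for class-index targets with mean reduction, no class weights, and no label smoothing. cross_entropy_reference is a hypothetical name used only for illustration; this is not PyTorch's implementation, and the target values [-100, 2, -100, 2] are assumed replacements for the ones in the question.

```python
import math


def cross_entropy_reference(logits, targets, ignore_index=-100):
    """Mean of -log_softmax(row)[target] over rows whose target is
    not ignore_index. Illustrative sketch, not the PyTorch kernel."""
    losses = []
    for row, t in zip(logits, targets):
        if t == ignore_index:
            continue  # ignored rows contribute nothing to the mean
        if not 0 <= t < len(row):
            # This is the condition the device-side assert guards on the GPU.
            raise IndexError(f"target {t} is out of range [0, {len(row)})")
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        losses.append(log_sum_exp - row[t])
    # No contributing rows: the mean is undefined.
    return sum(losses) / len(losses) if losses else float("nan")


row = [-10.3353, -28.4371, 2.0768, -4.2789, -8.6644, -6.0815]
logits = [row] * 4

# -100 is skipped because it equals the default ignore_index; 2 is a valid class.
loss = cross_entropy_reference(logits, [-100, 2, -100, 2])
print(loss)
```

Passing the question's original targets with ignore_index=-1 raises IndexError in this sketch, because -100 is then treated as an ordinary (and invalid) class index.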

0 votes

Change the torch version from 2.0.0+cu118 to 2.1.0.dev20230501+cu117:

pip3 install numpy --pre torch torchvision torchaudio --force-reinstall --index-url https://download.pytorch.org/whl/nightly/cu117
