Why does my PyTorch NN return a tensor of nan?

Question

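I have a very simple neural network that takes a flattened 6x6 grid as input and should output the values of the four actions that can be taken on that grid, i.e. a 1x4 tensor of values.

Sometimes, after a few runs, I get a 1x4 tensor of nan for some reason: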

tensor([[nan, nan, nan, nan]], grad_fn=<ReluBackward0>)

My model looks like this, with an input dim of 36 and an output dim of 4:

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F


class Model(nn.Module):
    def __init__(self, input_dim, output_dim):
        # initialize the nn.Module base class
        super(Model, self).__init__()
        # the flattened grid is the input;
        # the last layer needs 4 outputs, one per possible action: left, right, up, down
        # the output values are Q-values; an action is then picked from them, e.g. with argmax
        self.lin1 = nn.Linear(input_dim, 24)
        self.lin2 = nn.Linear(24, 24)
        self.lin3 = nn.Linear(24, output_dim)

    # function to feed the input through the net
    def forward(self, x):
        # convert NumPy input to a float tensor
        if isinstance(x, np.ndarray):
            x = torch.tensor(x, dtype=torch.float)

        # rectified linear as activation function, applied after all three layers
        activation1 = F.relu(self.lin1(x))
        activation2 = F.relu(self.lin2(activation1))
        output = F.relu(self.lin3(activation2))

        return output

The input is:

tensor([[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 1.0000, 0.0000, 0.0000, 0.0000,
         0.0000, 0.0000, 0.3333, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.3333,
         0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.3333, 0.0000, 0.0000, 0.0000,
         0.0000, 0.0000, 0.3333, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.6667]])

What are possible causes of the nan output, and how can I fix them?

Answer 1

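nan values in the output only tell you that training is unstable, and that can have many causes, including all kinds of bugs in the code. If you believe your code is correct, you can try to address the instability by lowering the learning rate or by using gradient clipping.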
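Both suggestions can be combined in a single training step. The following is only a minimal sketch, not the asker's actual training loop: the question shows neither the optimizer nor the loss, so the SGD optimizer, the MSE loss, the learning rate, the clipping threshold `MAX_GRAD_NORM`, and the random `state` and `target` tensors are all assumptions made for illustration. The layer sizes match the model in the question.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Same layer sizes and activations as the model in the question (36 -> 24 -> 24 -> 4).
model = nn.Sequential(
    nn.Linear(36, 24), nn.ReLU(),
    nn.Linear(24, 24), nn.ReLU(),
    nn.Linear(24, 4), nn.ReLU(),
)

# Assumption: plain SGD with a small learning rate; the question does not
# say which optimizer or learning rate was used.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

MAX_GRAD_NORM = 1.0  # hypothetical clipping threshold, tune for your task


def train_step(state, target):
    """Run one update with gradient clipping and return the loss as a float."""
    optimizer.zero_grad()
    prediction = model(state)
    loss = F.mse_loss(prediction, target)  # assumed loss function
    # Fail early if the loss is already nan or inf, so the bad value is
    # caught before it is written into the weights.
    if not torch.isfinite(loss):
        raise RuntimeError("non-finite loss, inspect inputs and targets")
    loss.backward()
    # Rescale the gradients so that their total norm is at most MAX_GRAD_NORM.
    torch.nn.utils.clip_grad_norm_(model.parameters(), MAX_GRAD_NORM)
    optimizer.step()
    return loss.item()


state = torch.rand(1, 36)   # placeholder for a flattened 6x6 grid
target = torch.rand(1, 4)   # placeholder for target Q-values
loss_value = train_step(state, target)
```

If the nan persists with a small learning rate and clipping, the finite-loss check at least narrows down the step at which the values first go bad, which points toward a bug in the inputs or targets rather than the optimization settings.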
