RuntimeError: trying to backward through the graph a second time on the loss tensor

Question (votes: 0, answers: 1)

I have the following training code. I'm fairly sure I only call loss.backward() once, yet I get the error in the title. What am I doing wrong? Note that X_train_tensor is the output of another graph's computation, so it has requires_grad=True, as you can see from the print statement. Is that the source of the problem? If so, how do I change it? It won't let me toggle the flag directly on the tensor.

for iter in range(max_iters):
    start_ix = 0
    loss = None

    while start_ix < len(X_train_tensor):
        loss = None
        end_ix = min(start_ix + batch_size, len(X_train_tensor))
        out, loss, accuracy = model(X_train_tensor[start_ix:end_ix], y_train_tensor[start_ix:end_ix])

        # every once in a while evaluate the loss on train and val sets
        if (start_ix == 0) and (iter % 10 == 0 or iter == max_iters - 1):
            out_val, loss_val, accuracy_val = model(X_val_tensor, y_val_tensor)
            print(f"step {iter}: train loss={loss:.2f} train_acc={accuracy:.3f} | val loss={loss_val:.2f} val_acc={accuracy_val:.3f}  {datetime.datetime.now()}")

        optimizer.zero_grad(set_to_none=True)
        print(iter, start_ix, X_train_tensor.requires_grad, y_train_tensor.requires_grad, loss.requires_grad)
        loss.backward()
        optimizer.step()
        start_ix = end_ix + 1

Here is the error:

RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.

Update: here is where the model's input tensors come from; they are the output of another (autoencoder) model:

autoencoder.eval()
with torch.no_grad(): # it seems like adding this line solves the problem?
    X_train_encoded, loss = autoencoder(X_train_tensor)
    X_val_encoded, loss = autoencoder(X_val_tensor)
    X_test_encoded, loss = autoencoder(X_test_tensor)

Adding the with torch.no_grad() line above fixed the problem, but I don't understand why. Does it actually change how the outputs are generated? How does it work?

python deep-learning pytorch tensor autograd
1 Answer

0 votes

As I understand it, X_train_tensor is the output of the autoencoder. When you don't run the encoding step inside torch.no_grad(), a computation graph is created for the autoencoder's output, linking the autoencoder's operations and weights to the encoded tensor. In your code, because the model consumes X_train_tensor, the model's loss is connected to the autoencoder's computation graph.
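
A quick way to see that linkage is to compare the encoder's output with and without grad mode (a minimal sketch with a stand-in nn.Linear, not your actual autoencoder):

import torch
import torch.nn as nn

encoder = nn.Linear(4, 2)        # stand-in for the autoencoder
x = torch.randn(8, 4)

out = encoder(x)                 # graph is recorded
print(out.requires_grad, out.grad_fn)        # True  <AddmmBackward0 ...>

with torch.no_grad():
    out_ng = encoder(x)          # no graph is recorded
print(out_ng.requires_grad, out_ng.grad_fn)  # False  None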

The first time you call loss.backward(), PyTorch walks the entire computation graph (including the autoencoder's part) to compute gradients, and then frees the graph. When you call loss.backward() in the second iteration of the loop, you are trying to backward through the autoencoder's graph, which has already been freed.
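
The failure can be reproduced in a few lines: the encoded tensor's graph is built once outside the loop and shared by every batch, so the second backward() has to revisit buffers that the first one already freed (a minimal sketch with stand-in nn.Linear modules, not your model):

import torch
import torch.nn as nn

encoder = nn.Linear(4, 2)                  # stand-in "autoencoder"
model = nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

X = torch.randn(8, 4)
X_encoded = encoder(X)                     # encoder graph is built once, outside the loop

for batch in range(2):
    optimizer.zero_grad(set_to_none=True)
    loss = model(X_encoded).sum()          # each batch's loss hangs off the same encoder graph
    loss.backward()                        # first call frees that graph; the second raises
    optimizer.step()                       #   "Trying to backward through the graph a second time ..."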

torch.no_grad() stops PyTorch from building the autoencoder's computation graph in the first place, so the resulting loss is never linked to the autoencoder.
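
If the encoded features are only meant to be fixed inputs for the downstream model, there are two common ways to cut that link (a sketch reusing the variable names from the question):

# 1. Don't record the autoencoder graph at all (also saves memory and time)
autoencoder.eval()
with torch.no_grad():
    X_train_encoded, _ = autoencoder(X_train_tensor)

# 2. Record it, but detach the result before reusing it
X_train_encoded, _ = autoencoder(X_train_tensor)
X_train_encoded = X_train_encoded.detach()    # requires_grad=False, no grad_fn

Passing retain_graph=True to backward(), as the error message suggests, would also silence the error, but then every batch would keep backpropagating into the frozen autoencoder, which is almost certainly not what you want here.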
