Replicating https://www.d2l.ai/chapter_linear-networks/linear-regression-scratch.html in PyTorch

Question

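I am trying to replicate the code in PyTorch, but I am running into problems with autograd. I get the following runtime error:

RuntimeError: Trying to backward through the graph a second time

The code is as follows: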

for epoch in range(num_epochs):
    # Assuming the number of examples is divisible by the batch size, all
    # the examples in the training data set are used once per epoch.
    # The features and labels of each mini-batch are given by X and y
    # respectively.
    for X, y in data_iter(batch_size, features, labels):
        print(X)
        print(y)
        l = loss(net(X, w, b), y)
        print(l)
        l.backward(retain_graph=True)
        print(w.grad)
        print(b.grad)

        with torch.no_grad():
            w -= w.grad * 1e-5 / batch_size
            b -= b.grad * 1e-5 / batch_size
            w.grad.zero_()
            b.grad.zero_()

Can someone explain how autograd works in PyTorch? It would also be great if someone could recommend a good resource for learning PyTorch.

Answer 1

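PyTorch's dynamic computation graph works quite differently from TensorFlow's. To save memory, PyTorch frees the intermediate buffers of the graph as soon as they are no longer needed, which by default happens during the first call to backward(). That means that if you want to backpropagate gradients through those intermediate nodes two or more times, you will run into trouble.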
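To make the failure concrete, here is a minimal sketch that is not taken from the question. The tensor `x` and the toy loss `(x * x).sum()` are made up for illustration. It shows the second backward() call failing on a freed graph, and then the same two calls succeeding when the first one keeps the graph.

```python
import torch

# Toy leaf tensor (illustrative values, not from the question).
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# x * x saves its inputs for the backward pass; those buffers are
# freed by the first backward() call.
loss = (x * x).sum()
loss.backward()
first_grad = x.grad.clone()  # d/dx sum(x^2) = 2 * x

try:
    loss.backward()  # the saved buffers are gone by now
    second_backward_failed = False
except RuntimeError as err:
    second_backward_failed = True
    print("second backward failed:", err)

# Rebuild the graph and keep it alive for a second pass.
x.grad.zero_()
loss = (x * x).sum()
loss.backward(retain_graph=True)
loss.backward()
accumulated_grad = x.grad.clone()  # two passes accumulate: 4 * x
```

Note that gradients accumulate in `x.grad` across backward() calls, which is why training loops zero them after each update.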

A simple fix is to set retain_graph=True on every backward() call except the last one. For example:

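# Autoencoder, x, and opt are assumed to be defined elsewhere.
# mse_loss and l1_loss are assumed to come from torch.nn.functional.
from torch.nn.functional import l1_loss, mse_loss

model = Autoencoder()
rec = model(x)
loss_1 = mse_loss(rec, x)
loss_2 = l1_loss(rec, x)

opt.zero_grad()
# Keep the graph alive so it can be traversed a second time.
loss_1.backward(retain_graph=True)
# Second pass over the same graph; gradients accumulate into .grad.
loss_2.backward()
opt.step()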
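For the training loop in the question, retain_graph should not be needed as long as the forward pass is recomputed inside the loop, because each iteration then builds a fresh graph. The sketch below is one possible self-contained version under stated assumptions. The synthetic data, the learning rate of 0.03, and the bodies of `data_iter`, `net`, and `loss` are stand-ins I have assumed, since the question does not show them.

```python
import torch

torch.manual_seed(0)

# Synthetic data (assumed values, not from the question).
true_w = torch.tensor([[2.0], [-3.4]])
true_b = 4.2
features = torch.randn(1000, 2)
labels = features @ true_w + true_b + 0.01 * torch.randn(1000, 1)


def data_iter(batch_size, features, labels):
    """Yield shuffled mini-batches (hypothetical stand-in)."""
    indices = torch.randperm(len(features))
    for i in range(0, len(features), batch_size):
        batch = indices[i:i + batch_size]
        yield features[batch], labels[batch]


def net(X, w, b):
    """Linear model (hypothetical stand-in)."""
    return X @ w + b


def loss(y_hat, y):
    """Summed squared loss over the batch (hypothetical stand-in)."""
    return ((y_hat - y) ** 2 / 2).sum()


w = torch.zeros(2, 1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr, batch_size, num_epochs = 0.03, 10, 3  # assumed hyperparameters

for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y)  # builds a fresh graph every iteration
        l.backward()               # plain backward(); nothing is reused
        with torch.no_grad():
            w -= lr * w.grad / batch_size
            b -= lr * b.grad / batch_size
            w.grad.zero_()
            b.grad.zero_()

with torch.no_grad():
    final_loss = loss(net(features, w, b), labels).item() / len(features)
print("mean training loss:", final_loss)
```

If a loop like this still raises the error, a likely cause is that some tensor feeding the loss was computed once outside the loop from tensors that require gradients, so every iteration backpropagates through that same old graph.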
