I have an m x n matrix and want to predict the entire next (m-1) x n matrix (y^{i} in the network diagram) from a 1 x n vector (x in the diagram), using an RNN or LSTM. I don't understand how to implement this: how to feed the 1 x n vector through the successive hidden states, compute all (m-1) x n output vectors, and compute the error over all the y^{i}. I have this vanilla RNN model, but I don't know how to modify it:
import torch
import torch.nn as nn

class RNNModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim):
        super(RNNModel, self).__init__()
        self.hidden_dim = hidden_dim
        self.layer_dim = layer_dim
        # input/output shape: (batch_dim, seq_dim, feature_dim)
        self.RNN = nn.RNN(input_dim, hidden_dim, layer_dim,
                          batch_first=True, nonlinearity='tanh')
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Initialize hidden state with zeros
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()
        out, h_t = self.RNN(x, h0)
        # Take the output at the last time step
        out = self.fc(out[:, -1, :])
        return out
You can turn one-to-many into many-to-many by repeating the input tensor along the sequence length. For example, with torch.nn.LSTM:

    lstm = torch.nn.LSTM(
        input_size=self.x_dim,
        hidden_size=self.out_dim,
        num_layers=lstm_layers,
        batch_first=True,
    )

    out = x.unsqueeze(1).repeat(1, seq_len, 1)  # (n_batches, L, H_in)
    out, _ = self.lstm(out)
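A minimal runnable sketch of this repeat-input approach, under my own assumptions (the names OneToManyLSTM, hidden_dim, and the MSE loss are illustrative, not from the question). Because the model emits all m-1 rows in one forward pass, the loss can be computed over the whole (m-1) x n target at once:

```python
import torch
import torch.nn as nn

class OneToManyLSTM(nn.Module):
    """Predict (seq_len, n) rows from a single 1 x n input vector."""
    def __init__(self, n, hidden_dim, seq_len, lstm_layers=1):
        super().__init__()
        self.seq_len = seq_len
        self.lstm = nn.LSTM(input_size=n, hidden_size=hidden_dim,
                            num_layers=lstm_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, n)

    def forward(self, x):                              # x: (batch, n)
        x = x.unsqueeze(1).repeat(1, self.seq_len, 1)  # (batch, seq_len, n)
        out, _ = self.lstm(x)                          # (batch, seq_len, hidden_dim)
        return self.fc(out)                            # (batch, seq_len, n)

# Toy usage: m = 5 rows, n = 8 features
m, n, batch = 5, 8, 4
model = OneToManyLSTM(n, hidden_dim=16, seq_len=m - 1)
x = torch.randn(batch, n)         # the 1 x n seed row
y = torch.randn(batch, m - 1, n)  # the remaining (m-1) x n rows
pred = model(x)
loss = nn.MSELoss()(pred, y)      # one loss over all y^{i} at once
loss.backward()
```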
Stanley, try this: initialize the hidden state only when no hidden state is passed in. Then return the hidden state and pass it back to forward() on the next iteration.
    def forward(self, x, h=None):
        if h is None:  # if no hidden state is passed
            # Initialize hidden state with zeros
            h = torch.zeros(self.layer_dim, x.size(0),
                            self.hidden_dim).requires_grad_()
        out, h_t = self.RNN(x, h)
        out = self.fc(out[:, -1, :])
        return out, h_t
In your training code, you can run the loop like this:

    x = seed
    h = None
    for i in range(...):
        optimizer.zero_grad()
        ...
        x, h = model(x, h)
        ...
        loss = ...
        loss.backward()
        optimizer.step()
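The loop above can be fleshed out into a self-contained sketch for the (m-1) x n prediction task, under my own assumptions (the StatefulRNN class, dimension names, and MSE loss are illustrative). One detail worth noting: since the hidden state carries the computation graph across steps, this sketch accumulates the per-step losses and calls backward() once after the unrolled loop, rather than calling backward() inside it:

```python
import torch
import torch.nn as nn

class StatefulRNN(nn.Module):
    """RNN whose forward() accepts and returns the hidden state."""
    def __init__(self, n, hidden_dim, layer_dim=1):
        super().__init__()
        self.hidden_dim, self.layer_dim = hidden_dim, layer_dim
        self.rnn = nn.RNN(n, hidden_dim, layer_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, n)

    def forward(self, x, h=None):                  # x: (batch, 1, n)
        if h is None:  # initialize hidden state only on the first step
            h = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim)
        out, h_t = self.rnn(x, h)
        return self.fc(out[:, -1, :]), h_t         # (batch, n), hidden state

# Toy setup: m = 5 rows, n = 8 features
m, n, batch = 5, 8, 4
model = StatefulRNN(n, hidden_dim=16)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
seed = torch.randn(batch, n)
targets = torch.randn(batch, m - 1, n)  # the (m-1) x n rows to predict

optimizer.zero_grad()
x, h = seed.unsqueeze(1), None
loss = 0.0
for i in range(m - 1):
    y_pred, h = model(x, h)                # carry hidden state forward
    loss = loss + nn.functional.mse_loss(y_pred, targets[:, i, :])
    x = y_pred.unsqueeze(1)                # feed the prediction back in
loss.backward()  # one backward pass through the whole unrolled sequence
optimizer.step()
```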