I recently started learning how neural networks work and want to begin training them. My first project is to train a neural network to add two numbers. Following some examples, I got this far and reached a low loss. Now I want to generate new numbers, pass them to the model, and see what it predicts. Here is what I have:
```python
import torch

torch.manual_seed(42)

N = 1000  # number of samples
D = 2     # input dimension
C = 1     # output dimension
lr = 1e-2 # learning rate

X = torch.rand(N, D)                     # 1000 samples of 2 dims
y = torch.sum(X, axis=-1).reshape(-1, C) # sum of each row of X, reshaped to 1 output dimension
# print(X[:50])
# print(y[:50])

model = torch.nn.Sequential(torch.nn.Linear(D, C))
criterion = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

for i in range(500):
    y_pred = model(X)
    loss = criterion(y_pred, y)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    if i % 50 == 0:
        print(i)
        print('---')
        print(loss)

idx = torch.tensor([3, 4])
for i in range(20):
    n = model.generate(idx, 50)
    print(n)
```
I can't generate from this model and am not sure how to do it.
To generate output with the model, you simply call the model on the input `X`:

```python
y_pred = model(X)
```

An example implementation of the script could be:
```python
import torch

torch.manual_seed(42)

N = 1000  # number of samples
D = 2     # input dimension
C = 1     # output dimension
lr = 1e-1 # learning rate

X = torch.rand(N, D)                     # 1000 samples of 2 dims
y = torch.sum(X, axis=-1).reshape(-1, C) # sum of each row of X, reshaped to 1 output dimension
print(f"X.shape: {X.shape}, y.shape:{y.shape}")
print(f"X[:5]: {X[:5]}")
print(f"y[:5]: {y[:5]}")

model = torch.nn.Sequential(torch.nn.Linear(D, C))
criterion = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

print("\nTraining model")
for i in range(500):
    y_pred = model(X)
    loss = criterion(y_pred, y)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    if i % 50 == 0:
        print(f"Epoch: {i+1:<5} loss: {loss.item()}")

print("\nTesting trained model on new random numbers")
for i in range(5):
    X = torch.rand(1, D)
    y_pred = model(X)
    print(f"{X[:, 0].item():.2f} + {X[:, 1].item():.2f} = {X.sum().item():.2f}, predicted: {y_pred.item():.2f}")

print(f"\nModel learned weights and biases\n{model.state_dict()}")
```
This is the output of the above implementation:
```
X.shape: torch.Size([1000, 2]), y.shape:torch.Size([1000, 1])
X[:5]: tensor([[0.8823, 0.9150],
        [0.3829, 0.9593],
        [0.3904, 0.6009],
        [0.2566, 0.7936],
        [0.9408, 0.1332]])
y[:5]: tensor([[1.7973],
        [1.3422],
        [0.9913],
        [1.0502],
        [1.0740]])

Training model
Epoch: 1     loss: 1.1976375579833984
Epoch: 51    loss: 0.013810846023261547
Epoch: 101   loss: 1.6199066521949135e-05
Epoch: 151   loss: 2.2864621485041425e-07
Epoch: 201   loss: 1.5489435289950393e-09
Epoch: 251   loss: 1.0740366929162803e-11
Epoch: 301   loss: 5.607882254811229e-14
Epoch: 351   loss: 4.732325513424554e-18
Epoch: 401   loss: 1.3877788466974007e-20
Epoch: 451   loss: 1.3877788466974007e-20

Testing trained model on new random numbers
0.29 + 0.47 = 0.77, predicted: 0.77
0.15 + 0.45 = 0.60, predicted: 0.60
0.57 + 0.48 = 1.05, predicted: 1.05
0.31 + 0.65 = 0.96, predicted: 0.96
0.37 + 0.22 = 0.59, predicted: 0.59

Model learned weights and biases
OrderedDict([('0.weight', tensor([[1., 1.]])), ('0.bias', tensor([2.3689e-09]))])
```
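One refinement worth knowing, though not strictly needed for this toy example: at inference time it is common to call `model.eval()` and wrap the forward pass in `torch.no_grad()` so PyTorch skips gradient bookkeeping. A minimal sketch (the weights are set by hand to the solution the training above converges to, rather than trained, just to keep the snippet self-contained):

```python
import torch

# Hand-set the weights to the converged solution [[1., 1.]] with zero bias,
# so this snippet runs standalone without a training loop.
model = torch.nn.Sequential(torch.nn.Linear(2, 1))
with torch.no_grad():
    model[0].weight.copy_(torch.tensor([[1.0, 1.0]]))
    model[0].bias.zero_()

model.eval()  # a no-op for Linear, but a good habit once dropout/batchnorm appear

# Inference: disable gradient tracking, since no backward pass is needed
with torch.no_grad():
    x = torch.tensor([[0.3, 0.4]])
    y_pred = model(x)
print(y_pred.item())
```

For this single-layer model the speed difference is negligible, but `torch.no_grad()` avoids building the autograd graph, which matters for larger models.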
Note: I changed the learning rate from `1e-2` to `1e-1` so that the optimization converges to a lower training loss. You can also verify that the trained model has weights `[[1., 1.]]` and bias `[2.3689e-09]`, which means the model has learned to add the two numbers: `y_pred = x_0 * 1 + x_1 * 1 + 2.3689e-09`.
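That closed form can be checked with plain Python, plugging in the weight and bias values from the state dict above:

```python
# Plain-Python version of the learned linear layer:
# y_pred = x_0 * w_0 + x_1 * w_1 + b, with the trained values plugged in.
w0, w1 = 1.0, 1.0
b = 2.3689e-09  # effectively zero

def predict_sum(x0, x1):
    return x0 * w0 + x1 * w1 + b

print(predict_sum(0.25, 0.50))  # essentially 0.75, off only by the tiny bias
```

This makes it concrete that a single `Linear(2, 1)` layer is exactly the function family needed for addition, which is why the loss can drop to near machine precision.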