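I am training a Siamese network for an NLP task. It uses an LSTM, and I train it with BCE loss. My loss has been increasing since the first epochs. The loss for the first 36 epochs is: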
error after 0 is
272.4357
[torch.FloatTensor of size 1]
error after 1 is
271.8972
[torch.FloatTensor of size 1]
error after 2 is
271.5598
[torch.FloatTensor of size 1]
error after 3 is
271.6979
[torch.FloatTensor of size 1]
error after 4 is
271.7315
[torch.FloatTensor of size 1]
error after 5 is
272.3965
[torch.FloatTensor of size 1]
error after 6 is
273.3982
[torch.FloatTensor of size 1]
error after 7 is
275.1197
[torch.FloatTensor of size 1]
error after 8 is
275.8228
[torch.FloatTensor of size 1]
error after 9 is
278.3311
[torch.FloatTensor of size 1]
error after 10 is
277.1054
[torch.FloatTensor of size 1]
error after 11 is
277.8418
[torch.FloatTensor of size 1]
error after 12 is
279.0189
[torch.FloatTensor of size 1]
error after 13 is
278.4090
[torch.FloatTensor of size 1]
error after 14 is
281.8813
[torch.FloatTensor of size 1]
error after 15 is
283.4077
[torch.FloatTensor of size 1]
error after 16 is
286.3093
[torch.FloatTensor of size 1]
error after 17 is
287.6292
[torch.FloatTensor of size 1]
error after 18 is
297.2318
[torch.FloatTensor of size 1]
error after 19 is
307.4176
[torch.FloatTensor of size 1]
error after 20 is
304.6649
[torch.FloatTensor of size 1]
error after 21 is
328.9772
[torch.FloatTensor of size 1]
error after 22 is
300.0669
[torch.FloatTensor of size 1]
error after 23 is
292.3902
[torch.FloatTensor of size 1]
error after 24 is
300.8633
[torch.FloatTensor of size 1]
error after 25 is
305.1822
[torch.FloatTensor of size 1]
error after 26 is
333.9984
[torch.FloatTensor of size 1]
error after 27 is
346.2062
[torch.FloatTensor of size 1]
error after 28 is
354.6148
[torch.FloatTensor of size 1]
error after 29 is
341.3568
[torch.FloatTensor of size 1]
error after 30 is
369.7580
[torch.FloatTensor of size 1]
error after 31 is
366.1615
[torch.FloatTensor of size 1]
error after 32 is
368.2455
[torch.FloatTensor of size 1]
error after 33 is
391.4102
[torch.FloatTensor of size 1]
error after 34 is
394.3190
[torch.FloatTensor of size 1]
error after 35 is
401.0990
[torch.FloatTensor of size 1]
error after 36 is
422.3723
[torch.FloatTensor of size 1]
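Your learning rate is probably too high. Try lowering it. A learning rate that is too large is the most common reason for the loss to start increasing from the first epochs.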
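To see why a large learning rate makes the loss go up instead of down, here is a minimal sketch on a toy problem. It is not your model: the loss `f(w) = w**2`, the function name `run_gradient_descent`, and the two learning rates are all made up for illustration. On this toy loss, each update multiplies `w` by `(1 - 2 * lr)`, so any `lr` above 1 makes every step overshoot the minimum by more than the previous one.

```python
def run_gradient_descent(lr, steps=10, w0=1.0):
    """Minimise the toy loss f(w) = w**2 and return the loss after each step."""
    w = w0
    losses = []
    for _ in range(steps):
        grad = 2.0 * w        # derivative of w**2
        w = w - lr * grad     # plain gradient descent update
        losses.append(w * w)  # loss after the update
    return losses


# Hypothetical learning rates, chosen only to show the two regimes.
small_lr_losses = run_gradient_descent(lr=0.1)  # shrinks towards 0
large_lr_losses = run_gradient_descent(lr=1.1)  # grows at every step

print("lr=0.1:", [round(x, 4) for x in small_lr_losses])
print("lr=1.1:", [round(x, 4) for x in large_lr_losses])
```

A real network has no single threshold like this, but the symptom is the same: if the loss climbs steadily, dividing the learning rate by 10 is a cheap first experiment.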
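Your loss is also very large, which is unusual. You probably have a sum in your loss function, and it may be wiser to replace that sum with a mean. With the Adam optimizer this makes essentially no difference, but with plain SGD, with or without momentum, using a sum instead of a mean means that you will have to retune the learning rate whenever the dimensions of your system (or the length of the sequences processed by your LSTM) change.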
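The difference between summing and averaging can be sketched in plain Python. This is an illustration under assumptions, not your actual loss: the helper names `bce_terms`, `bce_sum`, and `bce_mean` are hypothetical, and the inputs are made-up predictions of 0.5 for every element. In PyTorch the same choice is normally made through the reduction setting of the loss (`reduction="mean"` or `reduction="sum"` on `torch.nn.BCELoss` in current versions; older releases used a `size_average` flag instead).

```python
import math


def bce_terms(probs, targets):
    """Per-element binary cross-entropy for probabilities in (0, 1)."""
    return [-(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
            for p, t in zip(probs, targets)]


def bce_sum(probs, targets):
    """Summed BCE: grows with the number of elements."""
    return sum(bce_terms(probs, targets))


def bce_mean(probs, targets):
    """Averaged BCE: independent of the number of elements."""
    return bce_sum(probs, targets) / len(probs)


# Made-up data: a model that outputs 0.5 everywhere, on 8 and on 400 elements.
short_probs, short_targets = [0.5] * 8, [1.0, 0.0] * 4
long_probs, long_targets = [0.5] * 400, [1.0, 0.0] * 200

print("sum :", bce_sum(short_probs, short_targets), bce_sum(long_probs, long_targets))
print("mean:", bce_mean(short_probs, short_targets), bce_mean(long_probs, long_targets))
```

The mean stays at ln 2 (about 0.693) for both sizes, while the sum is 50 times larger on the longer input. The gradient scales the same way, which is why a learning rate tuned for one batch size or sequence length stops working for plain SGD when that size changes.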