How do I fix the scheduler error "param 'initial_lr' is not specified when resuming an optimizer"?

Question

In PyTorch, I configure SGD like this:

sgd_config = {
    'params' : net.parameters(),
    'lr' : 1e-7,
    'weight_decay' : 5e-4,
    'momentum' : 0.9
}
optimizer = SGD(**sgd_config)

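My requirements are:

Total epochs: 100
Decrease the learning rate by a factor of 10 every 30 epochs
Stop decreasing the learning rate at epoch 60

So over 100 epochs, the learning rate is multiplied by 0.1 twice.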
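To make the requirement concrete, here is a minimal sketch in plain Python (no PyTorch involved) that writes the intended schedule out as arithmetic. The helper name lr_at_epoch and its default arguments are made up for illustration; they only encode the numbers from the list above.

```python
from bisect import bisect_right

def lr_at_epoch(epoch, base_lr=1e-7, milestones=(30, 60), gamma=0.1):
    """Learning rate in effect during `epoch` (0-based).

    Hypothetical helper: counts how many milestones have been reached
    and applies one factor of `gamma` per milestone.
    """
    drops = bisect_right(milestones, epoch)
    return base_lr * gamma ** drops

# Intended learning rate for each of the 100 epochs.
schedule = [lr_at_epoch(e) for e in range(100)]
```

With these numbers, epochs 0 to 29 run at 1e-7, epochs 30 to 59 at roughly 1e-8, and epochs 60 to 99 at roughly 1e-9 (up to floating-point rounding).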

I read about the learning rate schedulers available in torch.optim.lr_scheduler, so I decided to try one instead of adjusting the learning rate manually:

scheduler = lr_scheduler.StepLR(optimizer, step_size=30, last_epoch=60, gamma=0.1)

However, I am getting:

Traceback (most recent call last):
  File "D:\Projects\network\network_full.py", line 370, in <module>
    scheduler = lr_scheduler.StepLR(optimizer, step_size=30, last_epoch=90, gamma=0.1)
  File "D:\env\test\lib\site-packages\torch\optim\lr_scheduler.py", line 367, in __init__
    super(StepLR, self).__init__(optimizer, last_epoch, verbose)
  File "D:\env\test\lib\site-packages\torch\optim\lr_scheduler.py", line 39, in __init__
    raise KeyError("param 'initial_lr' is not specified "
KeyError: "param 'initial_lr' is not specified in param_groups[0] when resuming an optimizer"

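I read a post here, but I still don't understand how to use the scheduler in my scenario. Maybe I just don't understand the definition of last_epoch, since the documentation is very brief about this parameter:

last_epoch (int) – The index of last epoch. Default: -1.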

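Since the parameter is exposed to the user, and nothing explicitly forbids running a scheduler for fewer epochs than the optimizer itself, I am starting to think this is a bug.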

Answer 1

You have misunderstood the last_epoch parameter, and you are not using the right learning rate scheduler for your requirements.

This should work:

optim.lr_scheduler.MultiStepLR(optimizer, [0, 30, 60], gamma=0.1, last_epoch=args.current_epoch - 1)

The last_epoch parameter makes sure the correct LR is used when resuming training. It defaults to -1, i.e. the epoch before epoch 0.
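As a rough illustration for a run that starts from scratch, the sketch below uses MultiStepLR with milestones at 30 and 60, which matches the stated requirement of two drops and nothing after epoch 60. The small nn.Linear model is only a stand-in for net and no real training happens; last_epoch is left at its default of -1 because nothing is being resumed.

```python
from torch import nn
from torch.optim import SGD, lr_scheduler

net = nn.Linear(4, 2)  # stand-in for the real network (assumption)
optimizer = SGD(net.parameters(), lr=1e-7, weight_decay=5e-4, momentum=0.9)

# Multiply the learning rate by gamma when epochs 30 and 60 are reached.
scheduler = lr_scheduler.MultiStepLR(optimizer, milestones=[30, 60], gamma=0.1)

lrs = []
for epoch in range(100):
    # Record the learning rate in effect during this epoch.
    lrs.append(optimizer.param_groups[0]['lr'])
    optimizer.step()   # the training loop for this epoch would go here
    scheduler.step()   # advance the schedule once per epoch
```

A StepLR with step_size=30 would lower the rate again at epoch 90, which is why explicit milestones are the better fit here.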

Answer 2

You have to set last_epoch=60 in a separate statement. In diff format:

<< scheduler = lr_scheduler.StepLR(optimizer, step_size=30, last_epoch=60, gamma=0.1)

>> scheduler = lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
>> scheduler.last_epoch = 60

Use this to check the scheduler's values:

print(scheduler.state_dict())

{'step_size': 30, 'gamma': 0.1, 'base_lrs': [0.0002], 'last_epoch': 4, '_step_count': 5, 'verbose': False, '_get_lr_called_within_step': False, '_last_lr': [0.0002]}

Answer 3

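sgd_config = {
    # 'initial_lr' has to sit inside a parameter group; SGD() does not accept it as a keyword argument
    'params' : [{'params': net.parameters(), 'initial_lr': 1e-7}],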
    'lr' : 1e-7,
    'weight_decay' : 5e-4,
    'momentum' : 0.9
}
optimizer = SGD(**sgd_config)

Answer 4

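last_epoch must be -1 unless you are resuming training. If you are trying to resume training, the problem is that you created the scheduler before loading the optimizer parameters.

The correct procedure for resuming: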

    sgd_config = {
        'params' : net.parameters(),
        'lr' : 1e-7,
        'weight_decay' : 5e-4,
        'momentum' : 0.9
    }
    optimizer = SGD(**sgd_config)
    optimizer.load_state_dict(torch.load('your_save_optimizer_params.pt'))
    scheduler = lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1, last_epoch=50)
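The sketch below walks through that order end to end, under a few assumptions: the checkpoint is kept in memory in a plain dict instead of being written with torch.save, a small nn.Linear stands in for net, and make_optimizer is a hypothetical helper. It first shows that a fresh optimizer reproduces the KeyError, then that loading the saved optimizer state (which carries initial_lr, because the original scheduler added it to the param group) lets the scheduler be created with last_epoch=50.

```python
import copy

from torch import nn
from torch.optim import SGD, lr_scheduler

def make_optimizer(net):
    # Hypothetical helper mirroring the SGD settings from the question.
    return SGD(net.parameters(), lr=1e-7, weight_decay=5e-4, momentum=0.9)

net = nn.Linear(4, 2)  # stand-in model (assumption)

# First run: a scheduler created with the default last_epoch=-1 stores
# 'initial_lr' in each param group of the optimizer.
optimizer = make_optimizer(net)
scheduler = lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
for epoch in range(51):      # epochs 0..50
    optimizer.step()         # real training would go here
    scheduler.step()
checkpoint = {'optimizer': copy.deepcopy(optimizer.state_dict())}

# Resuming with a fresh optimizer and no loaded state reproduces the error.
fresh = make_optimizer(net)
try:
    lr_scheduler.StepLR(fresh, step_size=30, gamma=0.1, last_epoch=50)
    raised = False
except KeyError:
    raised = True

# Resuming in the right order: load the optimizer state, then build the scheduler.
resumed_optimizer = make_optimizer(net)
resumed_optimizer.load_state_dict(checkpoint['optimizer'])
resumed_scheduler = lr_scheduler.StepLR(
    resumed_optimizer, step_size=30, gamma=0.1, last_epoch=50
)
resumed_lr = resumed_optimizer.param_groups[0]['lr']
```

Schedulers also have their own state_dict() and load_state_dict() methods, so saving and restoring the scheduler alongside the optimizer is another option.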
