BatchNorm layers are forced into training mode by torch.onnx.export when running stats are None

Question

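As described in this GitHub issue (which gives a very complete description), I am trying to load an .onnx model with the OpenVINO backend. However, when I set track_running_stats=False, the BatchNorm layers are treated as being in training mode. Here is how I set it before converting the torch model to ONNX: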

model.eval()

# Note: children() only visits direct submodules, not nested ones.
for child in model.children():
    if isinstance(child, nn.BatchNorm2d):
        child.track_running_stats = False
        child.running_mean = None
        child.running_var = None

Then I export the model to ONNX:

dummy_input = torch.randn(1, 3, 200, 200, requires_grad=True)  
torch.onnx.export(model, dummy_input, model_path,  export_params=True, opset_version=16, training=torch.onnx.TrainingMode.PRESERVE)

Finally, when I load it in OpenVINO, I get this error:

Error: Check '(node.get_outputs_size() == 1)' failed at src/frontends/onnx/frontend/src/op/batch_norm.cpp:67:
While validating ONNX node '<Node(BatchNormalization): BatchNormalization_10>':
Training mode of BatchNormalization is not supported.

As suggested in the GitHub issue, I tried inspecting the BatchNorm inputs and outputs:

import onnx

onnx_model = onnx.load(model_path)
for node in onnx_model.graph.node:
    names = list(node.input) + list(node.output)
    if any("BatchNorm" in s or "bn" in s for s in names):
        print('Node:', node.name)
        print(node)

This prints the following nodes related to BN:

Node: ReduceMean_5
input: "onnx::ReduceMean_22"
output: "onnx::BatchNormalization_23"
name: "ReduceMean_5"
op_type: "ReduceMean"
attribute {
  name: "axes"
  ints: 0
  ints: 1
  type: INTS
}
attribute {
  name: "keepdims"
  i: 0
  type: INT
}

Node: ReduceMean_9
input: "onnx::ReduceMean_26"
output: "onnx::BatchNormalization_27"
name: "ReduceMean_9"
op_type: "ReduceMean"
attribute {
  name: "axes"
  ints: 0
  ints: 1
  type: INTS
}
attribute {
  name: "keepdims"
  i: 0
  type: INT
}

Node: BatchNormalization_10
input: "input"
input: "bn1.weight"
input: "bn1.bias"
input: "onnx::BatchNormalization_23"
input: "onnx::BatchNormalization_27"
output: "input.4"
output: "29"
output: "30"
name: "BatchNormalization_10"
op_type: "BatchNormalization"
attribute {
  name: "epsilon"
  f: 9.999999747378752e-06
  type: FLOAT
}
attribute {
  name: "momentum"
  f: 0.8999999761581421
  type: FLOAT
}
attribute {
  name: "training_mode"
  i: 1
  type: INT
}

Node: BatchNormalization_13
input: "input.8"
input: "bn2.0.weight"
input: "bn2.0.bias"
input: "bn2.0.running_mean"
input: "bn2.0.running_var"
output: "input.12"
name: "BatchNormalization_13"
op_type: "BatchNormalization"
attribute {
  name: "epsilon"
  f: 9.999999747378752e-06
  type: FLOAT
}
attribute {
  name: "momentum"
  f: 0.8999999761581421
  type: FLOAT
}
attribute {
  name: "training_mode"
  i: 0
  type: INT
}

Node: ReduceMean_18
input: "onnx::ReduceMean_37"
output: "onnx::BatchNormalization_38"
name: "ReduceMean_18"
op_type: "ReduceMean"
attribute {
  name: "axes"
  ints: 0
  ints: 1
  type: INTS
}
attribute {
  name: "keepdims"
  i: 0
  type: INT
}

Node: ReduceMean_22
input: "onnx::ReduceMean_41"
output: "onnx::BatchNormalization_42"
name: "ReduceMean_22"
op_type: "ReduceMean"
attribute {
  name: "axes"
  ints: 0
  ints: 1
  type: INTS
}
attribute {
  name: "keepdims"
  i: 0
  type: INT
}

Node: BatchNormalization_23
input: "input.16"
input: "bn3.weight"
input: "bn3.bias"
input: "onnx::BatchNormalization_38"
input: "onnx::BatchNormalization_42"
output: "43"
output: "44"
output: "45"
name: "BatchNormalization_23"
op_type: "BatchNormalization"
attribute {
  name: "epsilon"
  f: 9.999999747378752e-06
  type: FLOAT
}
attribute {
  name: "momentum"
  f: 0.8999999761581421
  type: FLOAT
}
attribute {
  name: "training_mode"
  i: 1
  type: INT
}
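The full dump is long, so a compact summary can help. The following is a minimal sketch, not part of the original post: `batchnorm_training_modes` is a hypothetical helper name, and it assumes its argument behaves like `onnx_model.graph` (a `node` sequence whose items expose `op_type`, `name`, and `attribute`). It filters on `op_type` instead of matching tensor names.

```python
def batchnorm_training_modes(graph):
    """Map each BatchNormalization node name to its training_mode value.

    `graph` is assumed to look like `onnx_model.graph`.
    """
    modes = {}
    for node in graph.node:
        if node.op_type != "BatchNormalization":
            continue
        # training_mode defaults to 0 when the attribute is absent.
        mode = 0
        for attribute in node.attribute:
            if attribute.name == "training_mode":
                mode = attribute.i
        modes[node.name] = mode
    return modes
```

Called as `batchnorm_training_modes(onnx_model.graph)`, it should report a value of 1 for BatchNormalization_10 and BatchNormalization_23 and 0 for BatchNormalization_13, matching the dump.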

You can indeed see that 2 of the 3 BN layers have training_mode = 1 (i.e. True). How can I get ONNX to treat them as being in eval mode while keeping track_running_stats=False?

I'm not very familiar with ONNX, and more generally I'm a beginner in deep learning, so I would welcome any advice!

Answer 1

I finally found a solution, based on the suggestion in the GitHub issue linked in the question.

Once track_running_stats is set to False, the BatchNormalization layers are treated as being in training mode, as the ONNX graph shows.

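I removed the unused outputs that refer to mean and var from the batch normalization layers directly in the graph, then manually set the layers to evaluation mode (training_mode = 0). You must remove the unused outputs as well as set the training_mode attribute to 0; setting the attribute alone is not enough, because the check will still fail.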

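for node in onnx_model.graph.node:
    if node.op_type == "BatchNormalization":
        for attribute in node.attribute:
            if attribute.name == 'training_mode':
                if attribute.i == 1:
                    # Remove the two training-only outputs (mean and var);
                    # after the first removal the next one shifts to index 1.
                    node.output.remove(node.output[1])
                    node.output.remove(node.output[1])
                # Mark the node as running in inference mode.
                attribute.i = 0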

After that, I was able to run inference correctly and got the expected results.
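The fix relies on the extra BatchNormalization outputs really being unused. A minimal sketch of a check for that assumption follows; `consumed_extra_outputs` is a hypothetical helper name, not an ONNX API, and it assumes its argument behaves like `onnx_model.graph` (nodes with `op_type`, `input`, and `output`, plus graph outputs that have a `name`).

```python
def consumed_extra_outputs(graph):
    """Return extra BatchNormalization outputs that something still uses.

    `graph` is assumed to look like `onnx_model.graph`. Only outputs after
    the first one (the mean and var outputs) are considered.
    """
    # Every tensor name read by a node or exposed as a graph output.
    used = {name for node in graph.node for name in node.input}
    used.update(value_info.name for value_info in graph.output)

    still_used = []
    for node in graph.node:
        if node.op_type == "BatchNormalization":
            extras = list(node.output)[1:]
            still_used.extend(name for name in extras if name in used)
    return still_used
```

If `consumed_extra_outputs(onnx_model.graph)` returns an empty list before patching, dropping those outputs should not disconnect anything downstream. A non-empty result names the tensors that would need handling first.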
