So I'm training this small CNN model, which has a few Conv2D layers plus some MaxPool2D, Activation, and Dense layers, basically the basic layers that TensorFlow provides. I want it to run on an embedded system that doesn't have much space and can't do floating-point computation.
So I'm trying to train the model with QAT (quantization-aware training) so that the weights get (eventually) quantized to 8 bits, and I'm using tfmot.quantization.keras.QuantizeWrapperV2.
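For context, here is roughly how the model is built and wrapped. This is a minimal sketch reconstructed from the summaries below: the layer widths and kernel sizes follow from the output shapes and param counts, but the ReLU/sigmoid activations are assumptions.

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Plain float model; sizes chosen to reproduce the Param # column below.
inputs = tf.keras.Input(shape=(56, 56, 1))
x = tf.keras.layers.Conv2D(30, 3)(inputs)            # 3*3*1*30 + 30 = 300
x = tf.keras.layers.Activation("relu")(x)            # assumption: relu
x = tf.keras.layers.MaxPooling2D()(x)
x = tf.keras.layers.Conv2D(16, 3)(x)                 # 3*3*30*16 + 16 = 4336
x = tf.keras.layers.Activation("relu")(x)
x = tf.keras.layers.MaxPooling2D()(x)
x = tf.keras.layers.Conv2D(16, 3)(x)                 # 3*3*16*16 + 16 = 2320
x = tf.keras.layers.Activation("relu")(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(8)(x)                      # 16*8 + 8 = 136
x = tf.keras.layers.Activation("relu")(x)
x = tf.keras.layers.Dense(1)(x)                      # 8*1 + 1 = 9
outputs = tf.keras.layers.Activation("sigmoid")(x)   # assumption: sigmoid
model = tf.keras.Model(inputs, outputs)

# quantize_model wraps each layer in QuantizeWrapperV2 and inserts a
# QuantizeLayer after the input, which produces the quant_* layers
# in the second summary.
qat_model = tfmot.quantization.keras.quantize_model(model)

model.summary()
qat_model.summary()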
I can't make sense of the parameter counts, or of what this API does (mathematically) to each type of layer, and I'd like some help understanding the undocumented math behind it.
Below is a summary of the same model, once with QAT and once without. There is a difference in the Param # column that I can't explain. Thanks for the help.
Layer (type)                                         Output Shape          Param #
===================================================================================
input_2 (InputLayer)                                 [(None, 56, 56, 1)]   0
conv2d_3 (Conv2D)                                    (None, 54, 54, 30)    300
activation_3 (Activation)                            (None, 54, 54, 30)    0
max_pooling2d_2 (MaxPooling2D)                       (None, 27, 27, 30)    0
conv2d_4 (Conv2D)                                    (None, 25, 25, 16)    4336
activation_4 (Activation)                            (None, 25, 25, 16)    0
max_pooling2d_3 (MaxPooling2D)                       (None, 12, 12, 16)    0
conv2d_5 (Conv2D)                                    (None, 10, 10, 16)    2320
activation_5 (Activation)                            (None, 10, 10, 16)    0
global_average_pooling2d_1 (GlobalAveragePooling2D)  (None, 16)            0
dense (Dense)                                        (None, 8)             136
activation_6 (Activation)                            (None, 8)             0
dense_1 (Dense)                                      (None, 1)             9
activation_7 (Activation)                            (None, 1)             0
===================================================================================
Total params: 7,101
Trainable params: 7,101
Non-trainable params: 0
___________________________________________________________________________________
Layer (type)                                         Output Shape          Param #
===================================================================================
input_2 (InputLayer)                                 [(None, 56, 56, 1)]   0
quantize_layer_1 (QuantizeLayer)                     (None, 56, 56, 1)     3
quant_conv2d_3 (QuantizeWrapperV2)                   (None, 54, 54, 30)    301
quant_activation_3 (QuantizeWrapperV2)               (None, 54, 54, 30)    3
quant_max_pooling2d_2 (QuantizeWrapperV2)            (None, 27, 27, 30)    1
quant_conv2d_4 (QuantizeWrapperV2)                   (None, 25, 25, 16)    4337
quant_activation_4 (QuantizeWrapperV2)               (None, 25, 25, 16)    3
quant_max_pooling2d_3 (QuantizeWrapperV2)            (None, 12, 12, 16)    1
quant_conv2d_5 (QuantizeWrapperV2)                   (None, 10, 10, 16)    2321
quant_activation_5 (QuantizeWrapperV2)               (None, 10, 10, 16)    3
quant_global_average_pooling2d_1 (QuantizeWrapperV2) (None, 16)            3
quant_dense (QuantizeWrapperV2)                      (None, 8)             137
quant_activation_6 (QuantizeWrapperV2)               (None, 8)             3
quant_dense_1 (QuantizeWrapperV2)                    (None, 1)             14
quant_activation_7 (QuantizeWrapperV2)               (None, 1)             1
===================================================================================
Total params: 7,131
Trainable params: 7,101
Non-trainable params: 30
___________________________________________________________________________________
QuantizeWrapperV2 quantizes the weights and activations of the Keras layer it wraps; that is, during training it simulates converting the high-precision weights and activations to a lower-precision format. When you apply QAT, each layer is modified to support quantization during training, which adds a small number of parameters that manage the quantization process. All of them are non-trainable: note that Trainable params is 7,101 in both summaries, while the QAT model has exactly 30 extra non-trainable params. Comparing the two tables layer by layer shows where they land: +3 for the input QuantizeLayer, +1 for each wrapped Conv2D, MaxPooling2D, and the first Dense, +3 for each intermediate Activation and for the GlobalAveragePooling2D, and +5/+1 for the final Dense/Activation pair, which sums to 30. In tfmot's default 8-bit scheme these are bookkeeping variables, such as the min/max ranges recorded for the fake-quantization ops and a per-layer optimizer_step counter, not extra model weights. To learn more, see this documentation.
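To see what the fake quantization does, and where the extra 30 parameters live, here is a minimal sketch. The fake_quant function is the standard affine quantize/dequantize scheme that QAT trains against (tfmot's exact details, e.g. per-channel ranges and zero-point handling, are omitted); the loop simply lists each layer's non-trainable variables, whose names (min, max, optimizer_step) are what the default 8-bit scheme typically creates, so treat them as illustrative.

import numpy as np

def fake_quant(x, x_min, x_max, num_bits=8):
    # Standard affine quantize/dequantize ("fake quant"):
    # map [x_min, x_max] onto 2**num_bits integer levels, then back.
    scale = (x_max - x_min) / (2**num_bits - 1)
    q = np.round((np.clip(x, x_min, x_max) - x_min) / scale)
    return q * scale + x_min

# List the non-trainable bookkeeping variables QAT added per layer
# (qat_model as built in the sketch above).
for layer in qat_model.layers:
    if layer.non_trainable_weights:
        print(layer.name)
        for v in layer.non_trainable_weights:
            print(f"    {v.name}  shape={v.shape}")

Summing the variable counts printed by the loop should reproduce the 30 non-trainable params reported in the second summary.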