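I am training a binary detection architecture with TensorFlow 2.2 and Keras. Previously, everything worked fine when I loaded the data in the same script that trains the model. However, now that I am using a larger dataset (6x more samples, with the same ratio of positive to negative samples), I get the set of errors below. Training runs for a few epochs (5-10, I have tried several times) before the error appears: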
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: assertion failed: [predictions must be >= 0] [Condition x >= y did not hold element-wise:] [x (dense_1/Sigmoid:0) = ] [[[nan][nan][nan]]...] [y (Cast_4/x:0) = ] [0]
[[{{node assert_greater_equal/Assert/AssertGuard/else/_1/Assert}}]]
[[gradient_tape/point_conv_fp_1/ScatterNd/_192]]
(1) Invalid argument: assertion failed: [predictions must be >= 0] [Condition x >= y did not hold element-wise:] [x (dense_1/Sigmoid:0) = ] [[[nan][nan][nan]]...] [y (Cast_4/x:0) = ] [0]
[[{{node assert_greater_equal/Assert/AssertGuard/else/_1/Assert}}]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_14820]
This is the architecture:
This is the code for the layers where the error occurs:
# initialisation
..
# point_conv_sa layers
..
self.dense4 = keras.layers.Dense(128, activation=tf.nn.elu)
self.bn4 = keras.layers.BatchNormalization()
self.dropout4 = keras.layers.Dropout(0.5)
# This line corresponds to 'dense_1' in the image
self.dense_fin = keras.layers.Dense(self.num_classes, activation=tf.nn.sigmoid, bias_initializer=self.initial_bias)
# training step
..
# point_conv_fp layers
..
net = self.dense4(points)
net = self.bn4(net)
net = self.dropout4(net)
pred = self.dense_fin(net)
return pred
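Is this related to the loss function I am using? I used keras.losses.BinaryCrossentropy() and had no problems with either the small or the large dataset. I then switched to a focal loss based on https://github.com/mkocabas/focal-loss-keras , but it fails on the large dataset: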
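    import tensorflow as tf
    from tensorflow.keras import backend as K

    def focal_loss(gamma=2., alpha=.25):
        def focal_loss_fixed(y_true, y_pred):
            # Predictions for positive labels (1 elsewhere, so those terms vanish)
            pt_1 = tf.where(tf.equal(y_true, 1), y_pred, tf.ones_like(y_pred))
            # Predictions for negative labels (0 elsewhere, so those terms vanish)
            pt_0 = tf.where(tf.equal(y_true, 0), y_pred, tf.zeros_like(y_pred))
            return -K.mean(alpha * K.pow(1. - pt_1, gamma) * K.log(pt_1)) - K.mean((1 - alpha) * K.pow(pt_0, gamma) * K.log(1. - pt_0))
        return focal_loss_fixed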
....
    model.compile(
        optimizer=keras.optimizers.Adam(config['lr']),
        loss=focal_loss(alpha=config['fl_alpha'], gamma=config['fl_gamma']),
        metrics=[Precision(),
                 Recall(),
                 AUC()]
    )
Let me know if you need any more information.
Cheers
Updating to TensorFlow 2.10 should fix this. See https://github.com/keras-team/keras/issues/15715#issuecomment-1100795008
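If upgrading is not an option, one thing that may be worth trying (this is an assumption on my part, not something confirmed in the linked issue) is to clip the predictions before taking the logarithm. When the sigmoid output saturates to exactly 0 or 1, the log terms in the focal loss above become infinite, which can turn the weights and then the predictions into NaN. The sketch below is a plain-Python, per-sample version of the same loss with clipping, just to illustrate the idea. The name `focal_loss_clipped` and the margin `eps=1e-7` are hypothetical choices. In TensorFlow the equivalent clipping step would be `y_pred = tf.clip_by_value(y_pred, eps, 1. - eps)` at the top of `focal_loss_fixed`.

```python
import math


def focal_loss_clipped(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over paired lists of labels and probabilities.

    Predictions are clipped to [eps, 1 - eps] so that log() never
    receives 0. `eps` is an assumed margin, not a value from the post.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep both log arguments strictly positive
        if t == 1:
            # Positive sample: down-weight easy examples where p is close to 1
            total += -alpha * (1.0 - p) ** gamma * math.log(p)
        else:
            # Negative sample: down-weight easy examples where p is close to 0
            total += -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    return total / len(y_true)


if __name__ == "__main__":
    # A fully saturated wrong prediction stays finite instead of becoming inf
    print(focal_loss_clipped([0], [1.0]))
    # An ordinary batch behaves like the unclipped loss
    print(focal_loss_clipped([1, 0], [0.5, 0.5]))
```

Clipping only bounds the loss value. If the inputs themselves contain NaN or the learning rate is too high, the predictions can still blow up, so it is worth checking those as well.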