Error during training: layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4


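I am training a segmentation model with TensorFlow and ran into an error during training. After about 6 seconds, training stops and the following error message is shown: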

Epoch 1/100
2023-07-17 08:14:20.618828: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:954] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape inmodel_3/dropout_15/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
8278/8278 [==============================] - 6s 198us/step - loss: 2.1831 - accuracy: 0.8421 - val_loss: 2.2880 - val_accuracy: 0.8349

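I am using a custom data generator (DataGen) to load and preprocess the input images and masks. The error seems to be related to the model's layout, specifically the dropout layer. I am not sure why the size of the values does not match the size of the permutation. I think it may have something to do with the data generator.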
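For context on the wording of the message: it reads like a length check on a permutation. Converting NHWC to NCHW is a 4-element permutation, and the message says the optimizer found 0 values where it expected 4. The sketch below only illustrates that kind of check in plain Python; it is not TensorFlow's actual implementation, and `permute_shape` and `NHWC_TO_NCHW` are hypothetical names introduced here.

```python
# Illustration only: a length check similar in spirit to the one in the log message.
# permute_shape and NHWC_TO_NCHW are hypothetical names, not TensorFlow APIs.

NHWC_TO_NCHW = (0, 3, 1, 2)  # picks batch, channels, height, width from NHWC positions


def permute_shape(shape, perm=NHWC_TO_NCHW):
    """Reorder the entries of `shape` according to `perm`."""
    if len(shape) != len(perm):
        raise ValueError(
            f"Size of values {len(shape)} does not match size of permutation {len(perm)}"
        )
    return tuple(shape[i] for i in perm)


print(permute_shape((1, 128, 128, 3)))  # (1, 3, 128, 128)

try:
    permute_shape(())  # a shape with no entries, e.g. a scalar
except ValueError as err:
    print(err)
```

Read this way, the message would point at an input of the dropout node that the optimizer could not treat as 4-D, rather than at the images produced by the generator. That reading is an assumption, not something the log confirms.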

I have included the relevant code snippets below:

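import os

import cv2
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input

# Data generator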
class DataGen(tf.keras.utils.Sequence):
    def __init__(self, path_input, path_mask, class_name='person', batch_size=8, image_size=128):
        self.ids = os.listdir(path_mask)
        self.path_input = path_input
        self.path_mask = path_mask
        self.class_name = class_name
        self.batch_size = batch_size
        self.image_size = image_size
        self.on_epoch_end()

    def __load__(self, id_name):
        image_path = os.path.join(self.path_input, id_name)
        mask_path = os.path.join(self.path_mask, id_name)

        # Flag 1 is cv2.IMREAD_COLOR: 3 channels in BGR order (not RGB)
        image = cv2.imread(image_path, 1)
        # Flag -1 is cv2.IMREAD_UNCHANGED: the mask is read as stored
        mask = cv2.imread(mask_path, -1)
        if image is None or mask is None:
            # cv2.imread returns None when a file is missing or unreadable
            return None, None

        # Resize before feeding the network
        image = cv2.resize(image, (self.image_size, self.image_size))
        mask = cv2.resize(mask, (self.image_size, self.image_size))
        mask = mask.reshape((self.image_size, self.image_size, 1))

        # Normalize (divide by 255)
        image = image / 255.0
        mask = mask / 255.0

        return image, mask

    def __getitem__(self, index):
        # Each item is a batch of one sample; self.batch_size is not used here
        id_name = self.ids[index]
        image, mask = self.__load__(id_name)

        if image is not None and mask is not None:
            images = np.expand_dims(image, axis=0)
            masks = np.expand_dims(mask, axis=0)
        else:
            # Placeholder batch with the same (1, H, W, C) shape as above
            images = np.zeros((1, self.image_size, self.image_size, 3))
            masks = np.zeros((1, self.image_size, self.image_size, 1))

        return images, masks


    def on_epoch_end(self):
        pass

    def __len__(self):
        return len(self.ids)



# Configure model
image_size = 128
epochs = 100
batch_size = 10

# Create data generators
train_gen = DataGen(path_input="/kaggle/input/coco-2014-dataset-for-yolov3/coco2014/images/train2014",
                    path_mask="/kaggle/working/mask_train_2014",
                    batch_size=batch_size,
                    image_size=image_size)

val_gen = DataGen(path_input="/kaggle/input/coco-2014-dataset-for-yolov3/coco2014/images/val2014",
                  path_mask="/kaggle/working/mask_val_2014",
                  batch_size=batch_size,
                  image_size=image_size)

# Define model architecture
inputs = Input(shape=(128, 128, 3))
# ...

# Compile and train the model
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)  # `lr` is a deprecated alias
model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_gen, validation_data=val_gen, steps_per_epoch=train_steps, epochs=epochs)
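A side note on the snippet: `DataGen` accepts `batch_size` but returns one sample per index, so the `batch_size = 10` setting has no effect. This is probably unrelated to the layout message. A minimal sketch of the index arithmetic a batched `Sequence` would need, in plain Python; `num_batches` and `batch_slice` are hypothetical helper names, not part of Keras:

```python
import math


def num_batches(n_samples, batch_size):
    """Number of batches per epoch, counting a final partial batch."""
    return math.ceil(n_samples / batch_size)


def batch_slice(ids, index, batch_size):
    """File names that belong to batch number `index`."""
    start = index * batch_size
    return ids[start:start + batch_size]


# Hypothetical file names, just to show the arithmetic
ids = [f"img_{i}.jpg" for i in range(25)]
print(num_batches(len(ids), 10))        # 3
print(len(batch_slice(ids, 2, 10)))     # 5 (the last batch is shorter)
```

Under this assumption, `__len__` would return `num_batches(len(self.ids), self.batch_size)` and `__getitem__` would load every name in `batch_slice(self.ids, index, self.batch_size)` and stack the results with `np.stack`.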

Any insights or suggestions on how to resolve this would be greatly appreciated.

I am using the COCO 2014 dataset with TensorFlow version 2.12.0.

python tensorflow keras deep-learning
1 answer (0 votes)

I got a very similar error message during the first epoch of my own training run:

tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape insequential_2/efficientnetv2-b0/block2b_drop/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer

I learned that, in my case, I can simply ignore it. The fit routine keeps running and works fine.
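If the message is harmless, as this answer reports, one option that is sometimes suggested is to switch off Grappler's layout optimizer so the failing pass is not run at all. The sketch below uses `tf.config.optimizer.set_experimental_options` with the `layout_optimizer` key; whether it silences the message on a given setup is an assumption and has not been verified here, and disabling the pass may cost some GPU performance. `GRAPPLER_OPTIONS` and `disable_layout_optimizer` are hypothetical names.

```python
# Options passed to Grappler; only the layout optimizer is turned off.
GRAPPLER_OPTIONS = {"layout_optimizer": False}


def disable_layout_optimizer():
    """Turn off Grappler's layout optimizer for this process.

    TensorFlow is imported inside the function so that this module can be
    imported even on a machine where TensorFlow is not installed.
    """
    import tensorflow as tf

    tf.config.optimizer.set_experimental_options(GRAPPLER_OPTIONS)
```

Calling `disable_layout_optimizer()` once, before building the model and calling `model.fit`, should be enough for the setting to apply to the training graph.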
