Training a CNN with ImageDataGenerator fails after the second epoch


I am training a CNN with ImageDataGenerator and ran into this problem: an AttributeError is raised after the second epoch.

The model is as follows:

import tensorflow as tf
from tensorflow.keras.optimizers import RMSprop

def create_model():
  '''Creates a CNN with 4 convolutional layers'''
  model = tf.keras.models.Sequential([
      tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(150, 150, 3)),
      tf.keras.layers.MaxPooling2D(2, 2),
      tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
      tf.keras.layers.MaxPooling2D(2,2),
      tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
      tf.keras.layers.MaxPooling2D(2,2),
      tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
      tf.keras.layers.MaxPooling2D(2,2),
      tf.keras.layers.Flatten(),
      tf.keras.layers.Dense(512, activation='relu'),
      tf.keras.layers.Dense(1, activation='sigmoid')
  ])

  model.compile(loss='binary_crossentropy',
                optimizer=RMSprop(learning_rate=1e-4),
                metrics=['accuracy'])
  
  return model


from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        train_dir,  # This is the source directory for training images
        target_size=(150, 150),  # All images will be resized to 150x150
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary',
        shuffle=False)


EPOCHS = 20

model = create_model()

history = model.fit(
      train_generator,
      steps_per_epoch=100,  # 2000 images = batch_size * steps
      epochs=EPOCHS,
      validation_data=validation_generator,
      validation_steps=50,  # 1000 images = batch_size * steps
      verbose=2)

Output

AttributeError                            Traceback (most recent call last)
Cell In[15], line 8
      5 model = create_model()
      7 # Train the model
----> 8 history = model.fit(
      9       train_generator,
     10       steps_per_epoch=100,  # 2000 images = batch_size * steps
     11       epochs=EPOCHS,
     12       validation_data=validation_generator,
     13       validation_steps=50,  # 1000 images = batch_size * steps
     14       verbose=2)

File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\keras\src\utils\traceback_utils.py:122, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    119     filtered_tb = _process_traceback_frames(e.__traceback__)
    120     # To get the full stack trace, call:
    121     # `keras.config.disable_traceback_filtering()`
--> 122     raise e.with_traceback(filtered_tb) from None
    123 finally:
    124     del filtered_tb

File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\keras\src\backend\tensorflow\trainer.py:354, in TensorFlowTrainer.fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq)
    333         self._eval_epoch_iterator = TFEpochIterator(
    334             x=val_x,
    335             y=val_y,
...
    355     }
    356     epoch_logs.update(val_logs)
    358 callbacks.on_epoch_end(epoch, epoch_logs)

AttributeError: 'NoneType' object has no attribute 'items'
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output settings...

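I tried the following debugging steps:

  1. Upgraded TensorFlow and Keras.
  2. Tried a simpler neural network to see whether it has the same problem; it trained fine.
  3. Instead of passing validation_generator to model.fit(), ran validation manually with NumPy. That did not work either: the training accuracy and loss came out as 0, but only on even-numbered epochs.

I also checked that the validation data is loaded correctly.

Python version: 3.11.9, TensorFlow version: 2.17.0, Keras version: 3.4.1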

1 answer

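I reproduced your code with the versions you specified, and it runs: the model trains for the defined number of epochs. It looks like the problem may be related to how ImageDataGenerator handles the data in your directories. Please check the paths you provide and verify that the main directory contains subfolders, each representing a distinct class, and that the number of subfolders matches the total number of classes in the dataset.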
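As a rough way to run that directory check, the sketch below counts image files in each immediate subfolder of a dataset root. The helper name `count_images_per_class` and the extension set are assumptions made for illustration; they are not part of Keras, and the extension list may not match everything `flow_from_directory` accepts.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def count_images_per_class(root):
    """Return {subfolder_name: image_count} for each immediate subfolder of root.

    Hypothetical helper: each subfolder is treated as one class, which is
    how flow_from_directory is expected to infer class labels.
    """
    root = Path(root)
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images_per_class(train_dir)` and `count_images_per_class(validation_dir)` should, for `class_mode='binary'`, return exactly two entries each, both with non-zero counts. The totals can be compared with the "Found N images belonging to K classes." line that `flow_from_directory` prints.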

Please refer to this gist.
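One more thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the comments in the question assume exactly 2000 training and 1000 validation images. If the directories hold fewer images than `batch_size * steps`, the generator may run out of batches before an epoch finishes. The helpers below (`steps_for` and `check_steps`, hypothetical names) only do that arithmetic.

```python
import math


def steps_for(num_images, batch_size):
    """Batches needed to see every image once (the last batch may be partial)."""
    if num_images <= 0 or batch_size <= 0:
        raise ValueError("num_images and batch_size must be positive")
    return math.ceil(num_images / batch_size)


def check_steps(num_images, batch_size, requested_steps):
    """True if requested_steps does not exceed what one pass over the data provides."""
    return requested_steps <= steps_for(num_images, batch_size)
```

With the generators from the question, `train_generator.samples` gives the number of images found, so `check_steps(train_generator.samples, 20, 100)` and `check_steps(validation_generator.samples, 20, 50)` should both be true. If either is false, passing `steps_per_epoch=len(train_generator)` and `validation_steps=len(validation_generator)`, or omitting both arguments, is a reasonable thing to try.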
