Adding a convolutional layer freezes training


I am trying to train an image classification model with six classes and a training dataset of roughly seven thousand images. My problem is that a simple model performs well and reaches about 85% accuracy, but a slightly deeper model freezes and never gets past 5% accuracy. I can increase the size of the dense layers and the number of kernels in the convolutional layer, and both improve accuracy, but as soon as I add a second convolutional layer the model gets stuck at a loss of exactly 1.7918. If it fluctuated between 1.9 and 1.6 I would suspect a local minimum, but it stays at 1.7918 across multiple epochs.
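One thing I noticed: 1.7918 is exactly the categorical cross-entropy of a model that always predicts the uniform distribution over the six classes, which is easy to check:

```python
import math

# Cross-entropy of a uniform prediction over 6 classes:
# -log(1/6) = log(6)
uniform_loss = -math.log(1 / 6)
print(round(uniform_loss, 4))  # 1.7918
```

So the deeper model seems to be outputting (near-)uniform probabilities rather than bouncing around a minimum.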

Below is the code for the simple model.

import tensorflow as tf
import pathlib
import os
import matplotlib.pyplot as plt
import keras
from tensorflow.keras import utils
from tensorflow.keras import layers
import pickle
from keras.optimizers import gradient_descent_v2

checkpoint_dir = "./training_checkpoints"
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

callbacks = [
    tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_prefix, save_weights_only=True),
    tf.keras.callbacks.TensorBoard(log_dir='./logs')
]

gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

strategy = tf.distribute.MirroredStrategy()

my_dir = "/workspace/"

train_dir = os.path.join(my_dir, 'data/train')
trainT = utils.image_dataset_from_directory(train_dir, labels="inferred", label_mode="categorical", class_names=None, color_mode="grayscale", image_size=(802,393), shuffle=True, seed=2023, batch_size=32)
trainT = trainT.apply(tf.data.experimental.ignore_errors())

test_dir = os.path.join(my_dir, 'data/test')
testT = utils.image_dataset_from_directory(test_dir, labels="inferred", label_mode="categorical", class_names=None, color_mode="grayscale", image_size=(802,393), shuffle=True, seed=2023, batch_size=32)

evalu_dir = os.path.join(my_dir, 'data/evaluation')
evaluT = utils.image_dataset_from_directory(evalu_dir, labels="inferred", label_mode="categorical", class_names=None, color_mode="grayscale", image_size=(802,393), shuffle=True, seed=2023, batch_size=32)

with strategy.scope():
    model = keras.Sequential(
        [
            layers.Rescaling(1./255),
            layers.Conv2D(32, 32, activation='relu', padding="valid", data_format='channels_last'),
            layers.MaxPooling2D((13,13), strides=1),
            layers.Flatten(),
            layers.Dense(25, activation='relu', use_bias=True, kernel_initializer='glorot_uniform'),
            layers.Dense(6, activation='relu', use_bias=True, kernel_initializer='glorot_uniform'),
            layers.Activation('softmax')
        ]
    )

    opt = gradient_descent_v2.SGD(learning_rate=0.01)  # 'lr' is a deprecated alias

    model.compile(optimizer=opt,
                 loss=tf.keras.losses.CategoricalCrossentropy(),
                 metrics=[tf.keras.metrics.CategoricalAccuracy(),
                         tf.keras.metrics.MeanIoU(num_classes=6)])

    history = model.fit(x=trainT, epochs=20, validation_data=evaluT, callbacks=callbacks)

model.save("fitModel")

with open('/trainHistory', 'wb') as file_pi:
    pickle.dump(history.history, file_pi)

Below is the code I changed for the slightly deeper model; anything not shown is identical to the simple model.

    model = keras.Sequential(
        [
            layers.Rescaling(1./255),
            layers.Conv2D(32, 32, activation='relu', padding="valid", data_format='channels_last'),
            layers.MaxPooling2D((13,13), strides=1),
            layers.Conv2D(32, 7, activation='relu', padding="valid", data_format='channels_last'),
            layers.MaxPooling2D((4,4), strides=1),
            layers.Flatten(),
            layers.Dense(25, activation='relu', use_bias=True, kernel_initializer='glorot_uniform'),
            layers.Dense(6, activation='relu', use_bias=True, kernel_initializer='glorot_uniform'),
            layers.Activation('softmax')
        ]
    )

    opt = tf.keras.optimizers.SGD(learning_rate=0.1)

The higher learning rate was an attempt to escape a local minimum, but that clearly did not help.
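For reference, here is a quick helper I used to sanity-check the feature-map sizes in the deeper model (all layers use valid padding and both pools use stride 1, as in the code above); it shows the Flatten output is enormous, so the first Dense layer carries on the order of 200 million weights:

```python
def valid_out(n, kernel, stride=1):
    # Output length along one axis for a 'valid' conv/pool.
    return (n - kernel) // stride + 1

h, w = 802, 393
for name, k in [('conv 32x32', 32), ('pool 13x13', 13),
                ('conv 7x7', 7), ('pool 4x4', 4)]:
    h, w = valid_out(h, k), valid_out(w, k)
    print(name, '->', (h, w))

flat = h * w * 32            # 32 channels from the last conv layer
print('Flatten units:', flat)            # 8,184,000
print('Dense(25) weights:', flat * 25)   # 204,600,000
```

(The same helper gives 771x362 after the conv and 759x350 after the pool in the simple model.)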

I have tried changing the learning rate, I have tried the Adam optimizer, I have tried removing the pooling layer between the two convolutional layers, and I have tried removing the pooling layer after the second convolutional layer.
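While debugging I also sketched, in plain Python with no TensorFlow, how the relu on the final Dense(6) layer feeding into the softmax could pin the loss at exactly this value: if every logit coming out of that layer goes negative, relu zeroes them all and softmax then returns the uniform distribution.

```python
import math

def softmax(zs):
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    s = sum(es)
    return [e / s for e in es]

# relu clips negatives to zero; if all six logits are negative,
# the softmax input is all zeros and the output is uniform.
logits = [-2.0, -0.5, -1.3, -3.0, -0.1, -4.2]
relu_out = [max(z, 0.0) for z in logits]
probs = softmax(relu_out)
loss = -math.log(probs[0])  # same for any true class: output is uniform
print([round(p, 4) for p in probs])  # six values of 1/6, i.e. 0.1667
print(round(loss, 4))                # 1.7918
```

The logit values here are just made-up examples, but the mechanism would explain a loss frozen at exactly 1.7918.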

Thanks for any help you can offer.

python tensorflow keras deep-learning conv-neural-network