VGG16 and VGG19 don't learn anything during training, even though AlexNet does fine?

I'm reproducing the results of an ML research paper. The paper is about palm vein recognition using CNNs: it trains 3 CNNs on different palm vein datasets, one of which is the FYODB dataset.

I train my models from scratch using Keras, and although AlexNet does well, with over 95% test accuracy, for some reason neither VGG16 nor VGG19 learns anything during training: their accuracy never even reaches 0.1 in any epoch (in fact it hovers around the 1/160 ≈ 0.006 chance level for 160 classes, as the sample output below shows).

I'll share the code I used to build and train the models. Note that the paper I'm replicating intentionally reduces the number of filters in each Conv2D layer to cut training time (though I also tried the original architecture, with the same result).

Some key constants:

  • Number of classes: 160
  • Samples in the dataset: 6400
  • Train-val split: 80-20
  • Train-test split: 80-20
  • Total train, test, and validation images: 4096, 1024, 1280
  • Image shape: (224, 224, 3)
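
The data pipeline itself isn't shown here, but a split with these counts could be produced roughly as below. This is only a sketch: the directory name and seed are placeholder assumptions, while the batch size of 32 matches the 128 steps per epoch over 4096 training images seen in the output further down.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

full_dataset = keras.utils.image_dataset_from_directory(
    "FYODB/",          # assumed layout: one sub-folder per class (160 classes)
    image_size=(224, 224),
    batch_size=None,   # yield individual samples so the splits come out exact
    shuffle=True,
    seed=1337,
)
val_dataset = full_dataset.take(1280).batch(32)   # 20% of 6400
rest = full_dataset.skip(1280)
test_dataset = rest.take(1024).batch(32)          # 20% of the remaining 5120
train_dataset = rest.skip(1024).batch(32)         # 4096 images -> 128 steps of 32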

Here is the code I use for building and training. I've also tried transfer learning with pretrained VGG16 weights, and that actually works well, with over 95% test accuracy.

data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
        layers.RandomContrast(0.1),
        layers.RandomTranslation(0.1, 0.1),
        layers.RandomHeight(0.1),
        layers.RandomWidth(0.1),
    ]
)
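
As a quick sanity check of this pipeline (a minimal sketch, assuming the batched train_dataset from above): Keras preprocessing layers only randomize when called with training=True, and RandomHeight/RandomWidth change the spatial dimensions of the output.

images, labels = next(iter(train_dataset))
augmented = data_augmentation(images, training=True)  # force the random transforms on
print(images.shape, augmented.shape)  # height/width may differ due to RandomHeight/RandomWidth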

def make_vgg16_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)

    # Block 1
    x = data_augmentation(inputs)
    x = layers.Rescaling(1.0 / 255)(inputs)
    x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(inputs)
    x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)


    # Block 2
    x = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(64, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)


    # Block 3
    x = layers.Conv2D(96, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(96, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(96, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)


    # Block 4
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)


    # Block 5
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(128, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)


    # Flatten and Fully Connected Layers
    x = layers.Flatten()(x)
    x = layers.Dense(4096, activation='relu')(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(4096, activation='relu')(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation='softmax')(x)

    return keras.Model(inputs, outputs)
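
make_alexnet_model and make_vgg19_model (used below) are not shown in the post. As a purely hypothetical reconstruction, make_vgg19_model presumably mirrors the builder above with a fourth conv layer in Blocks 3-5 (the standard VGG16-to-VGG19 difference) and the same reduced filter counts; this sketch wires the input straight into the conv stack:

def make_vgg19_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    x = inputs
    # (filters, convs): 2 convs in Blocks 1-2, 4 convs in Blocks 3-5
    for filters, convs in [(32, 2), (64, 2), (96, 4), (128, 4), (128, 4)]:
        for _ in range(convs):
            x = layers.Conv2D(filters, (3, 3), activation='relu', padding='same')(x)
        x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)
    x = layers.Flatten()(x)
    for _ in range(2):
        x = layers.Dense(4096, activation='relu')(x)
        x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_classes, activation='softmax')(x)
    return keras.Model(inputs, outputs)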

import time

import tensorflow as tf
from tqdm import tqdm

num_epochs = 30
image_size = (224, 224, 3)  # image shape from the constants above
num_classes = 160           # number of classes from the constants above

models = {
    "AlexNet": make_alexnet_model(input_shape=image_size, num_classes=num_classes),
    "VGG16": make_vgg16_model(input_shape=image_size, num_classes=num_classes),
    "VGG19": make_vgg19_model(input_shape=image_size, num_classes=num_classes),
}

model_histories = {}

for name, model in models.items():
    print(f'\x1b[34mTraining {name} Model...\x1b[0m')
    model.compile(
        optimizer=keras.optimizers.Adam(1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    start = time.time()
        
    # Wrap model.fit with tqdm for a progress bar
    progress_bar = tqdm(total=num_epochs, position=0, leave=True)
    history = model.fit(
        train_dataset,
        epochs=num_epochs,
        validation_data=val_dataset,
        verbose=1,
        callbacks=[
            tf.keras.callbacks.LambdaCallback(on_epoch_end=lambda epoch, logs: progress_bar.update(1)),
        ]
    )
    progress_bar.close()
    
    model_histories[name] = history
    
    end = time.time()
    print(f'Finished training {name} in {end-start:.2f}s\n')
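
Test accuracy is then measured along these lines (again a sketch, assuming the test_dataset from the split above):

for name, model in models.items():
    test_loss, test_acc = model.evaluate(test_dataset, verbose=0)
    print(f'{name}: test accuracy = {test_acc:.4f}')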

Sample output:

Epoch 14/30
128/128 [==============================] - ETA: 0s - loss: 5.0713 - accuracy: 0.0054
 47%|████▋     | 14/30 [05:35<06:16, 23.50s/it]
128/128 [==============================] - 23s 182ms/step - loss: 5.0713 - accuracy: 0.0054 - val_loss: 5.1037 - val_accuracy: 0.0023
Epoch 15/30
128/128 [==============================] - ETA: 0s - loss: 5.0709 - accuracy: 0.0081
 50%|█████     | 15/30 [05:58<05:53, 23.55s/it]
Tags: python, keras, deep-learning, conv-neural-network, vgg-net

1 Answer

The only thing in your code that jumps out at me is this:

    # Block 1
    x = data_augmentation(inputs)  # This is not being used
    x = layers.Rescaling(1.0 / 255)(inputs)  # This is not being used
    x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(inputs)
    x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)

It looks like this discards the first two layers by reassigning x without ever using their outputs in the subsequent layers? If that's right, the missing rescaling and data augmentation would explain why this network is hard to train.
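
A minimal rewiring of Block 1 that actually threads the augmentation and rescaling into the conv stack would look like this (a sketch; the Resizing layer is my addition, needed because RandomHeight/RandomWidth change the spatial dimensions, which would otherwise break the fixed-size Flatten/Dense head):

    # Block 1, with each layer consuming the previous layer's output
    x = data_augmentation(inputs)
    x = layers.Resizing(224, 224)(x)  # restore a fixed size after RandomHeight/RandomWidth
    x = layers.Rescaling(1.0 / 255)(x)
    x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2))(x)

One way to confirm the wiring problem: in the Keras functional API, a layer call whose output is never consumed does not end up on the inputs-to-outputs path, so model.summary() on the original model lists no augmentation or Rescaling layers at all.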
