Transfer learning with EfficientNet for a binary classification task is stuck at 50%


I have been trying to do transfer learning with EfficientNet for a binary classification task.

My directory structure looks like this:

training
├── label0
└── label1

validation
├── label0
└── label1

This is how I create the image data generators:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=30,           
    width_shift_range=0.0,       
    height_shift_range=0.0,
    shear_range=0.0,            
    zoom_range=0.0,              
    horizontal_flip=True,
    fill_mode='nearest'
)
# no augmentations are applied to the validation data
validation_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    'training',
    target_size=(240, 240),     
    batch_size=128,               
    class_mode='binary',          # Binary classification
    shuffle=True
)

validation_generator = validation_datagen.flow_from_directory(
    'validation', 
    target_size=(240, 240),
    batch_size=128,
    class_mode='binary',
    shuffle=True
)

The output is:

Found 19747 images belonging to 2 classes.
Found 4938 images belonging to 2 classes.

This is the code that builds the model:

import tensorflow as tf  # needed below for tf.keras.Model and tf.keras.optimizers
from tensorflow.keras.applications import EfficientNetB1
from tensorflow.keras import layers

NUM_CLASSES = 2
IMG_SIZE = 240
size = (IMG_SIZE, IMG_SIZE)

def build_model():
    inputs = layers.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
    
    model = EfficientNetB1(include_top=False, input_tensor=inputs, weights="imagenet")

    model.trainable = False
    
    x = layers.GlobalAveragePooling2D(name="avg_pool")(model.output)
    x = layers.BatchNormalization()(x)
    # x = layers.Dense(128, activation="relu")(x)
    # top_dropout_rate = 0.2
    # x = layers.Dropout(top_dropout_rate, name="dropout")(x)
    # outputs = layers.Dense(1, activation="sigmoid", name="pred")(x)
    outputs = layers.Dense(1, activation="sigmoid", name="pred")(x)
    
    model = tf.keras.Model(inputs, outputs, name="EfficientNet")
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
    model.compile(
        optimizer=optimizer,
        loss="binary_crossentropy",
        metrics=["accuracy"]
    )
    
    return model

As you can see, I am trying a lot of things just to get something to work, but without success. I set model.trainable to False because I am following this implementation of transfer learning. I haven't even finished the first step; I was going to continue by unfreezing layers once I reached 60-70%, but I can't even get past 52%.

This is how I start the training:

model = build_model()
epochs = 50
hist = model.fit(train_generator, 
                 epochs=epochs, 
                 steps_per_epoch=len(train_generator), 
                 validation_data=validation_generator,
                 validation_steps=len(validation_generator))

For the first 10 epochs, both accuracy and val_accuracy stay around 50%, which is nothing more than random guessing.

I have tried lowering and raising the learning rate (1e-1 to 1e-6), varying the batch size (32 to 256), adding dropout (0.1 to 0.5), adding a relu layer (32 to 128 units), and making sure the images are all in the correct class folders. What can I do to make the model actually learn?
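For reference, a minimal sketch of the kind of sanity check I ran on the data pipeline (assuming the train_generator defined above):

# Check the class mapping and label balance of the generator
print(train_generator.class_indices)   # e.g. {'label0': 0, 'label1': 1}

x_batch, y_batch = next(train_generator)
print(x_batch.shape)                   # (128, 240, 240, 3)
print(x_batch.min(), x_batch.max())    # 0.0, 1.0 after the 1./255 rescale
print(y_batch.mean())                  # should be near 0.5 for balanced classes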

tensorflow machine-learning keras transfer-learning efficientnet
1 Answer

In your code you use the line:

model.trainable = False

This line freezes the entire pretrained base, so none of its weights can be updated during training; only the new classification head can learn. If you want to do transfer learning, you have to unfreeze some layers.

Here is an example of how to unfreeze the last 2 layers; try experimenting with different values:

# base_model refers to the pretrained EfficientNetB1 instance
# created inside build_model
# Unfreeze the top layers of the base model
for layer in base_model.layers[-2:]:
    if not isinstance(layer, layers.BatchNormalization):
        layer.trainable = True
    else:
        layer.trainable = False

The BatchNormalization layers are kept frozen because on small datasets they can cause convergence problems (the statistics in this layer would become unstable).
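Also note that changes to the trainable flag only take effect after the model is compiled again. A minimal sketch of how to continue (assuming the model returned by build_model above), with a lower learning rate, which is typical when fine-tuning pretrained layers:

# Recompile after changing `trainable`, otherwise the unfreezing has no effect
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="binary_crossentropy",
    metrics=["accuracy"]
)
model.fit(train_generator, epochs=10, validation_data=validation_generator)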
