I have been trying to do transfer learning with EfficientNet on a binary classification task.
My directory structure looks like this:
training
├── label0
└── label1
validation
├── label0
└── label1
I create the image data generators from it:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=30,
    width_shift_range=0.0,
    height_shift_range=0.0,
    shear_range=0.0,
    zoom_range=0.0,
    horizontal_flip=True,
    fill_mode='nearest'
)
No augmentation is applied to the validation data:
validation_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    'training',
    target_size=(240, 240),
    batch_size=128,
    class_mode='binary',  # Binary classification
    shuffle=True
)
validation_generator = validation_datagen.flow_from_directory(
    'validation',
    target_size=(240, 240),
    batch_size=128,
    class_mode='binary',
    shuffle=True
)
The output is:
Found 19747 images belonging to 2 classes.
Found 4938 images belonging to 2 classes.
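To double-check the subdirectory-to-label mapping, I print class_indices. The snippet below is a small self-contained illustration using a temporary directory with blank dummy images; in my real setup I just inspect train_generator.class_indices directly:

```python
import os
import tempfile

import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Build a tiny stand-in for the real directory tree with blank images.
root = tempfile.mkdtemp()
for label in ("label0", "label1"):
    os.makedirs(os.path.join(root, label))
    for i in range(2):
        Image.fromarray(np.zeros((240, 240, 3), dtype=np.uint8)).save(
            os.path.join(root, label, f"{i}.png"))

gen = ImageDataGenerator(rescale=1./255).flow_from_directory(
    root, target_size=(240, 240), batch_size=2, class_mode='binary')

# flow_from_directory assigns integer labels to subdirectories
# in alphabetical order.
print(gen.class_indices)  # {'label0': 0, 'label1': 1}
```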
Here is the code that builds the model:
import tensorflow as tf
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
from tensorflow.keras.applications import EfficientNetB1
from tensorflow.keras.models import Model
from tensorflow.keras import layers

NUM_CLASSES = 2
IMG_SIZE = 240
size = (IMG_SIZE, IMG_SIZE)

def build_model():
    inputs = layers.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
    model = EfficientNetB1(include_top=False, input_tensor=inputs, weights="imagenet")
    model.trainable = False
    x = layers.GlobalAveragePooling2D(name="avg_pool")(model.output)
    x = layers.BatchNormalization()(x)
    # x = layers.Dense(128, activation="relu")(x)
    # top_dropout_rate = 0.2
    # x = layers.Dropout(top_dropout_rate, name="dropout")(x)
    outputs = layers.Dense(1, activation="sigmoid", name="pred")(x)
    model = tf.keras.Model(inputs, outputs, name="EfficientNet")
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
    model.compile(
        optimizer=optimizer,
        loss="binary_crossentropy",
        metrics=["accuracy"]
    )
    return model
As you can see, I am trying a lot of things just to get something to work, but with no success. I set
model.trainable
to False because I am following this implementation of transfer learning. I haven't even gotten past the first step. I was planning to unfreeze layers once I reached 60-70% accuracy, but I can't even get past 52%.
Here is how I start training:
model = build_model()
epochs = 50
hist = model.fit(train_generator,
                 epochs=epochs,
                 steps_per_epoch=len(train_generator),
                 validation_data=validation_generator,
                 validation_steps=len(validation_generator))
For the first 10 epochs, both accuracy and val_accuracy stay around 50%, which is no better than random guessing.
I have tried lowering and raising the learning rate (1e-1 to 1e-6), changing the batch size (32 to 256), adding dropout (0.1 to 0.5), adding relu layers (32 to 128 units), and making sure all the images are in the correct class. What can I do to get the model to actually learn?
In your code you use the line:
model.trainable = False
This freezes all of the base model's weights, so the backbone cannot learn anything. If you want to do transfer learning, you have to unfreeze some layers.
Here is an example of how to unfreeze the last 2 layers; try experimenting with different values:
# Unfreeze the top layers of the base model
for layer in base_model.layers[-2:]:
    if not isinstance(layer, layers.BatchNormalization):
        layer.trainable = True
    else:
        layer.trainable = False
The BatchNormalization layers stay frozen because on small datasets they can cause convergence problems (the statistics in these layers would become unstable).
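Note that changing trainable flags only takes effect after you compile the model again. Below is a minimal sketch of the fine-tuning step under my assumptions (weights=None is used here only to keep the example self-contained without downloading ImageNet weights; in practice you keep the weights you already trained with, and base_model refers to the EfficientNet backbone as in the snippet above):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical standalone rebuild of the backbone for illustration.
base_model = tf.keras.applications.EfficientNetB1(
    include_top=False, weights=None, input_shape=(240, 240, 3))

# Freeze everything, then unfreeze the top layers (except BatchNorm).
for layer in base_model.layers:
    layer.trainable = False
for layer in base_model.layers[-2:]:
    layer.trainable = not isinstance(layer, layers.BatchNormalization)

# Attach the classification head.
x = layers.GlobalAveragePooling2D()(base_model.output)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = tf.keras.Model(base_model.input, outputs)

# Re-compile AFTER changing the trainable flags; Keras picks them up
# at compile time. A lower learning rate is typical for fine-tuning.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="binary_crossentropy",
              metrics=["accuracy"])
```

After this, call model.fit again; with the top layers unfrozen the backbone can adapt to your data instead of staying fixed at its initial weights.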