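I am trying to classify roof tiles into 16 classes. My original training dataset contains 6800 images in total. I used offline augmentation to generate an additional 750 images per class.
The test dataset was collected in a separate session of photographing roof tiles and contains 1100 images in total. I used offline augmentation here as well and generated an additional 450 images per class from this dataset.
The model reaches 93% accuracy on the training dataset and 84.2% on the validation set. However, the highest accuracy I have obtained on the test dataset is 60%.
I tried adding dropout layers to the model, changing the batch size, and so on, but the test accuracy never exceeds 60%. What am I missing?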
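import pathlib

from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Dataset locations (training and validation share one directory)
data_dir = pathlib.Path("C:/Users/../train_750Aug")
train_data_dir = data_dir
validation_data_dir = data_dir
test_data_dir = pathlib.Path("C:/Users/../Testset_cubed_aug450")

# Training configuration
nr_of_epochs = 20
saved_model_name = 'C:/Users/../models/750augTrain_Aug450Test.h5'
batch_size = 32
img_height = 224
img_width = 224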
# Online augmentation; 25% of the images in data_dir are held out for validation
datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    validation_split=0.25,
    width_shift_range=[-10, 10],
    horizontal_flip=True,
    brightness_range=[0.5, 1.5],
    vertical_flip=True,
)

# target_size is (height, width); both are 224 here
train_generator = datagen.flow_from_directory(
    data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    subset="training",
    class_mode='categorical',
)
validation_generator = datagen.flow_from_directory(
    data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    subset="validation",
    class_mode='categorical',
)

# The test images are only rescaled, not augmented online
test_datagen = ImageDataGenerator(rescale=1.0 / 255)
test_generator = test_datagen.flow_from_directory(
    test_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
)
def define_VGGmodel():
    # load the convolutional base without the original classifier
    model = VGG16(include_top=False, input_shape=(img_height, img_width, 3))
    # freeze every loaded layer except the ones listed here
    list_trainable = ['block5_conv3']
    for layer in model.layers:
        if layer.name in list_trainable:
            layer.trainable = True
            print(layer.name)
        else:
            layer.trainable = False
    # add new classifier layers
    flat1 = Flatten()(model.layers[-1].output)
    class1 = Dense(128, activation='relu', kernel_initializer='he_uniform')(flat1)
    droppyDrop = Dropout(0.2)(class1)
    output = Dense(16, activation='softmax')(droppyDrop)
    # define new model
    model = Model(inputs=model.inputs, outputs=output)
    # compile model
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
    return model
model1 = define_VGGmodel()

# fit() already returns a History object, so no separate History callback is needed
history = model1.fit(train_generator, steps_per_epoch=len(train_generator), epochs=nr_of_epochs,
                     validation_data=validation_generator, shuffle=True, verbose=2)
model1.save(saved_model_name)
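I tried offline augmentation on both the training and test datasets, as well as changing the batch size and adding dropout layers, but the test accuracy did not improve and stayed at 60%.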
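Your classifier may be too constrained. You have
Dense(128) -> ReLU
Dropout(0.2)
Dense(16) -> Softmax
whereas VGG16 with
include_top=True
has (with the output layer sized for your 16 classes instead of the stock 1000 ImageNet classes)
Dense(4096) -> ReLU
Dense(4096) -> ReLU
Dense(16) -> Softmax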
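To put rough numbers on that comparison, here is a small back-of-the-envelope sketch in plain Python. It assumes the 224x224 input from the question, for which the VGG16 convolutional base outputs a 7x7x512 feature map (25088 values after flattening). The helper names `dense_params` and `head_params` are made up for this illustration.

```python
def dense_params(n_in, n_out):
    """Trainable parameters of one fully connected layer: weights plus biases."""
    return n_in * n_out + n_out


def head_params(n_features, layer_sizes):
    """Total parameters of a stack of Dense layers fed with n_features inputs."""
    total = 0
    n_in = n_features
    for n_out in layer_sizes:
        total += dense_params(n_in, n_out)
        n_in = n_out
    return total


# Flattened VGG16 feature map for a 224x224 input (assumption stated above)
FLATTENED = 7 * 7 * 512

small_head = head_params(FLATTENED, [128, 16])       # head from the question
vgg_head = head_params(FLATTENED, [4096, 4096, 16])  # VGG16-style head, 16 classes

print(f"question head:     {small_head:,} parameters")
print(f"VGG16-style head:  {vgg_head:,} parameters")
```

Dropout adds no parameters, so it is left out of the count. The head in the question comes to roughly 3.2 million parameters, against roughly 120 million for the VGG16-style head.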
I suggest giving your classifier more weights and more layers and seeing where that takes you.
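As one possible way to try this, the sketch below keeps the setup from the question (VGG16 base, only block5_conv3 trainable, same compile settings) and swaps in a wider, deeper head. The function name `define_vgg_model_wide_head` and the defaults `hidden_units=(512, 512)` and `dropout_rate=0.2` are illustrative assumptions, not tuned values, so treat them as a starting point for experiments.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Model


def define_vgg_model_wide_head(hidden_units=(512, 512), dropout_rate=0.2,
                               n_classes=16, weights="imagenet"):
    """VGG16 base with a larger classifier head (hypothetical variant)."""
    base = VGG16(include_top=False, weights=weights, input_shape=(224, 224, 3))

    # Same freezing policy as in the question: only block5_conv3 stays trainable
    for layer in base.layers:
        layer.trainable = layer.name == "block5_conv3"

    x = Flatten()(base.output)
    # One Dense + Dropout pair per entry in hidden_units (sizes are assumptions)
    for units in hidden_units:
        x = Dense(units, activation="relu", kernel_initializer="he_uniform")(x)
        x = Dropout(dropout_rate)(x)
    output = Dense(n_classes, activation="softmax")(x)

    model = Model(inputs=base.inputs, outputs=output)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["categorical_accuracy"])
    return model
```

Calling `define_vgg_model_wide_head()` in place of `define_VGGmodel()` leaves the rest of the training script unchanged. Passing `hidden_units=(4096, 4096)` would mirror the VGG16 head, at the cost of well over 100 million trainable parameters, which may be a lot for a dataset of this size.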