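I am trying to use tf.data to augment a dataset I have. The dataset is laid out locally on my machine as follows:
datasets/fruits/{class_name}/*jpg
{class_name} covers 7 different fruits: strawberry, mango, broccoli, grape, apple, lemon, and orange.
Here is the code I wrote for the data augmentation step. As you can see, I am not actually augmenting the data yet; I am only using tf.data.Dataset.from_tensor_slices to load the images and rescale the pixels: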
import os
import random
import tensorflow as tf
from tensorflow.data import AUTOTUNE
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.layers.experimental import preprocessing
from imutils import paths
from sklearn.preprocessing import LabelEncoder
INIT_LR = 1e-2 # learning rate
BS = 32 # batch size
EPOCHS = 50 # number of epochs
# load images with tensorflow
def load_images(imagePath, label):
    image = tf.io.read_file(imagePath)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    image = tf.image.resize(image, (64, 64))
    return (image, label)
# augment helper function
def augment(image, label, aug):
    image = aug(image)
    return (image, label)
# get all image paths and save them as strings with format **/{class_name}/*jpg
allImages = list(paths.list_images("datasets/fruits"))
random.shuffle(allImages) # shuffle the images
# perform 0.75/0.25 train/test split
i = int(len(allImages) * 0.25)
trainPaths = allImages[i:]
# get labels by getting {class_name} from **/{class_name}/*jpg
trainLabels = [p.split(os.path.sep)[-2] for p in trainPaths]
testPaths = allImages[:i]
testLabels = [p.split(os.path.sep)[-2] for p in testPaths]
# integer-encode the class names with LabelEncoder, then one-hot encode them with to_categorical
labelEncoder = LabelEncoder()
labelEncoder = labelEncoder.fit(trainLabels)
trainLabels = labelEncoder.transform(trainLabels)
trainLabels = to_categorical(trainLabels)
testLabels = labelEncoder.transform(testLabels)
testLabels = to_categorical(testLabels)
# load the train and test data into a tf.data.Dataset
trainDS = tf.data.Dataset.from_tensor_slices((trainPaths, trainLabels))
trainDS = (
    trainDS
    .shuffle(32, seed=42)
    .map(load_images, num_parallel_calls=AUTOTUNE)
    .batch(BS)
    .cache()
)
# rescale the pixels to [0, 1]
trainAug = tf.keras.Sequential(
    [
        preprocessing.Rescaling(scale=1.0/255),
    ]
)
trainDS = (
    trainDS
    .map(lambda x, y: augment(x, y, trainAug), num_parallel_calls=AUTOTUNE)
    .prefetch(AUTOTUNE)
)
testDS = tf.data.Dataset.from_tensor_slices((testPaths, testLabels))
testDS = (
    testDS
    .shuffle(32)
    .map(load_images, num_parallel_calls=AUTOTUNE)
    .batch(BS)
    .cache()
)
testAug = tf.keras.Sequential(
    [
        preprocessing.Rescaling(scale=1.0/255),
    ]
)
testDS = (
    testDS
    .map(lambda x, y: augment(x, y, testAug), num_parallel_calls=AUTOTUNE)
    .prefetch(AUTOTUNE)
)
# I don't think there are any issues with this part, but here I am setting up the optimizer and model for training
# (MiniVGGNet is defined elsewhere and is not shown here)
num_classes = len(labelEncoder.classes_)  # number of distinct class names seen by the encoder
sgd = SGD(learning_rate=INIT_LR, momentum=0.9, weight_decay=INIT_LR/EPOCHS)
model = MiniVGGNet.build(64, 64, 3, num_classes=num_classes)
model.compile(loss="categorical_crossentropy", optimizer=sgd, metrics=["accuracy"])
training_history = model.fit(
    x=trainDS,
    validation_data=testDS,
    epochs=EPOCHS
)
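These are the training results I get. You can see that val_loss and val_accuracy show no sign of improving.
I suspect the problem lies in how the image data is loaded and processed rather than in the model I am using. With the same model, I was able to get reasonable results when training on images loaded with a plain Python approach instead of the TensorFlow approach. These are the results I got:
I would expect similar training results regardless of how I load the image data. I would appreciate it if anyone could shed some light on my problem! I have been stuck on this for days.
Trying to load a dataset with tf.data and train on it, but getting strange results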
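It turned out I was rescaling the pixels twice. The line
image = tf.image.convert_image_dtype(image, dtype=tf.float32)
already rescales the pixels to [0, 1] once, because the decoded JPEG is uint8, and then I rescaled them again in the preprocessing step: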
trainAug = tf.keras.Sequential(
    [
        preprocessing.Rescaling(scale=1.0/255),
    ]
)
After removing
preprocessing.Rescaling(scale=1.0/255),
from the pipeline, training started to produce reasonable results.
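A quick way to catch this kind of problem is to pull one batch out of the pipeline and look at its pixel range. The following is only a minimal sketch, not the original training script: `load_image` is a hypothetical label-free variant of `load_images`, and the single random JPEG written to a temporary directory is a stand-in so the snippet runs without the fruits dataset.

```python
import os
import tempfile

import tensorflow as tf


def load_image(image_path):
    # Same steps as load_images, minus the label and with no Rescaling layer afterwards.
    image = tf.io.read_file(image_path)
    image = tf.image.decode_jpeg(image, channels=3)  # uint8 in [0, 255]
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)  # float32 in [0, 1]
    return tf.image.resize(image, (64, 64))


# Assumption: one synthetic noise image stands in for the real dataset.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.jpg")
random_pixels = tf.cast(
    tf.random.uniform((80, 80, 3), maxval=256, dtype=tf.int32), tf.uint8
)
tf.io.write_file(sample_path, tf.io.encode_jpeg(random_pixels))

dataset = tf.data.Dataset.from_tensor_slices([sample_path]).map(load_image).batch(1)

# Inspect the value range of the first batch before training on it.
batch = next(iter(dataset))
batch_min = float(tf.reduce_min(batch))
batch_max = float(tf.reduce_max(batch))
print("batch shape:", batch.shape)
print("pixel range:", batch_min, batch_max)
```

If the maximum comes out near 1/255 (about 0.004) instead of near 1, the pixels have most likely been rescaled one time too many.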
The documentation mentions this automatic rescaling behavior: https://www.tensorflow.org/api_docs/python/tf/image/convert_image_dtype
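To make the effect concrete, here is a small sketch of the arithmetic on a made-up 1x2 image (the pixel values are illustrative, not taken from my dataset). It uses `tf.keras.layers.Rescaling`, which I am assuming behaves the same as the `preprocessing.Rescaling` layer in the question.

```python
import tensorflow as tf

# A made-up 1x2 RGB "image" stored as uint8, the dtype decode_jpeg produces.
pixels = tf.constant([[[0, 128, 255], [255, 255, 255]]], dtype=tf.uint8)

# Converting uint8 to float32 also scales the values into [0, 1].
scaled_once = tf.image.convert_image_dtype(pixels, dtype=tf.float32)

# A second 1/255 rescale squeezes everything into [0, 1/255].
scaled_twice = tf.keras.layers.Rescaling(scale=1.0 / 255)(scaled_once)

max_once = float(tf.reduce_max(scaled_once))
max_twice = float(tf.reduce_max(scaled_twice))
print("max after convert_image_dtype:", max_once)
print("max after the extra Rescaling:", max_twice)
```

After the second rescale every input the network sees is nearly zero, which would explain why the validation metrics barely moved.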