Loading several unlabeled images into a Keras CNN

Question · Votes: 0 · Answers: 3

I have several .jpeg images with different names, and I want to load them into a CNN in a Jupyter notebook to classify them. The only way I have found is:

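import numpy as np
from tensorflow.keras.preprocessing import image

# Load one image, resize it to the model's input size, and add a batch axis.
test_image = image.load_img("name_of_picture.jpeg", target_size=(64, 64))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis=0)
result = cnn.predict(test_image)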

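Everything else I found in the Keras API (such as tf.keras.preprocessing.image_dataset_from_directory()) seems to work only for labeled data. Unfortunately, I can't "simply" iterate over the picture names, because they are all named differently. Is there a way to predict all the pictures at once without naming each one?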

Thanks for your help,

Nick

python-3.x tensorflow keras jupyter-notebook conv-neural-network
3 Answers
3
votes

Solution

tf.keras.preprocessing.image_dataset_from_directory can be adapted to return the image paths along with the dataset, as described here: https://stackoverflow.com/a/63725072/4994352
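
A minimal sketch of that idea, assuming a TensorFlow release in which image_dataset_from_directory accepts labels=None and attaches a file_paths attribute to the returned dataset (recent 2.x versions do, as far as I know; please check your installed version). The folder name "images/", the model name cnn, and both function names are placeholders for illustration, not part of the linked answer.

```python
def load_unlabeled_images(directory, image_size=(64, 64), batch_size=32):
    """Return (dataset, file_paths) for every image under `directory`."""
    import tensorflow as tf  # imported lazily so the helper below works without TF

    dataset = tf.keras.utils.image_dataset_from_directory(
        directory,
        labels=None,       # no class sub-folders required
        label_mode=None,   # yield images only
        image_size=image_size,
        batch_size=batch_size,
        shuffle=False,     # keep batches aligned with dataset.file_paths
    )
    return dataset, list(dataset.file_paths)


def top_class_by_path(file_paths, predictions):
    """Map each file path to the index of its highest-scoring class."""
    result = {}
    for path, scores in zip(file_paths, predictions):
        scores = list(scores)
        result[path] = max(range(len(scores)), key=scores.__getitem__)
    return result


# Usage (assumes a trained model named `cnn`, as in the question):
# dataset, paths = load_unlabeled_images("images/")
# predictions = cnn.predict(dataset)
# print(top_class_by_path(paths, predictions))
```

Taking the arg-max only makes sense for a multi-class (softmax) output; for a model with a single sigmoid output you would compare each score against a threshold instead.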


0
votes

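There are several ways to do this. For larger datasets, tf.data.Dataset is useful because its performance is easy to tune. The code below is not performance-optimized. Replace <YOUR PATH INCL. REGEX> with a path pattern such as ../input/pokemon-images-and-types/images/*/* (a glob pattern, despite the placeholder's name).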

import tensorflow as tf
AUTOTUNE = tf.data.AUTOTUNE  # TF 2.4+; on older versions use tf.data.experimental.AUTOTUNE


def load(file_path):
    img = tf.io.read_file(file_path)
    img = tf.image.decode_jpeg(img, channels=3)
    
    ... # do some preprocessing like resizing if necessary

    return img


# Match every image in the sub-folders; shuffle=False keeps the file order stable for prediction.
list_ds = tf.data.Dataset.list_files('<YOUR PATH INCL. REGEX>', shuffle=False)
train_dataset = list_ds.take(-1)  # take(-1) keeps every element
# Set `num_parallel_calls` so multiple images are loaded/processed in parallel.
train_dataset = train_dataset.map(load, num_parallel_calls=AUTOTUNE)
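
To get from this pipeline to predictions, the images still need a fixed size and a batch dimension. The sketch below is one possible way to do that, not the answerer's code. It collects the paths with Python's glob instead of Dataset.list_files, so the order is known and each prediction can be matched to its file. The 64x64 size comes from the question; the pattern "images/*.jpeg", the model name cnn, and the function names are assumptions. No pixel rescaling is applied, because the question's code does not rescale either.

```python
import glob


def sorted_image_paths(pattern):
    """Return the files matching a glob pattern in a stable, sorted order."""
    return sorted(glob.glob(pattern))


def build_prediction_dataset(paths, image_size=(64, 64), batch_size=32):
    """Build a batched tf.data pipeline that yields resized images."""
    import tensorflow as tf  # lazy import: only needed when the pipeline is built

    def load(file_path):
        img = tf.io.read_file(file_path)
        img = tf.image.decode_jpeg(img, channels=3)
        # Resize to the input size the model was trained with (assumed 64x64).
        return tf.image.resize(img, image_size)

    dataset = tf.data.Dataset.from_tensor_slices(paths)
    dataset = dataset.map(load, num_parallel_calls=tf.data.AUTOTUNE)
    return dataset.batch(batch_size)


# Usage (assumes a trained model named `cnn`, as in the question):
# paths = sorted_image_paths("images/*.jpeg")
# predictions = cnn.predict(build_prediction_dataset(paths))
# for path, prediction in zip(paths, predictions):
#     print(path, prediction)
```

Note that tf.image.resize uses bilinear interpolation by default, which may differ slightly from what load_img does with target_size.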

0
votes

Store the image files in a subdirectory, like this:

  train_dataset
     |
     |--class_0
        | 
        |- <images>
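
If the images currently sit in one flat folder, the layout above can be produced with a few lines of standard-library Python. This is only a sketch: the function name, the folder names in the usage line, and the list of extensions are assumptions for illustration.

```python
import shutil
from pathlib import Path


def stage_into_class_dir(source_dir, dataset_root, class_name="class_0",
                         extensions=(".jpeg", ".jpg", ".png")):
    """Copy all images from a flat folder into dataset_root/class_name."""
    target = Path(dataset_root) / class_name
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(source_dir).iterdir()):
        # Skip sub-folders and anything that is not an image file.
        if path.is_file() and path.suffix.lower() in extensions:
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied


# Usage: stage_into_class_dir("my_images", "train_dataset")
```

The files are copied rather than moved, so the original folder stays untouched.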

Now you can use ImageDataGenerator to load the images from the directory (in our case, from "train_dataset/class_0"):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1./255,  # Normalize the images
    rotation_range=20,  # Random rotations
    width_shift_range=0.2,  # Horizontal shifts
    height_shift_range=0.2,  # Vertical shifts
    shear_range=0.2,  # Shear transformations
    zoom_range=0.2,  # Random zoom
    horizontal_flip=True,  # Horizontal flips
    fill_mode='nearest',  # Filling strategy for new pixels
    validation_split=0.2
)

train_generator = datagen.flow_from_directory(
    image_directory,
    target_size=(64, 64),  # Adjust based on your needs
    batch_size=8,
    class_mode=None,  # Unsupervised learning
    shuffle=True,
    subset='training'  # Set as training data
)

validation_generator = datagen.flow_from_directory(
    image_directory,
    target_size=(64, 64),  # Adjust based on your needs
    batch_size=8,
    class_mode=None,  # Unsupervised learning
    shuffle=True,
    subset='validation'  # Set as validation data
)

Here, image_directory is "train_dataset/", i.e. the parent directory. Note that class_mode=None specifies that we do not need labels.

Getting a batch:

for batch in train_generator:
    print(batch)
    break