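I have a dataset in which one folder contains images and another folder contains the corresponding text files. Each text file holds the class label of its image.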
Images folder
image_0000.jpeg
image_0001.jpeg
Label folder
image_0000.txt
image_0001.txt
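Each label text file contains the value 0, 1, or 2.
I want to save the images with label 0 in a separate folder, and likewise for the remaining labels 1 and 2,
as shown below.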
def read_image_list(image_list_file):
    # Each line of the list file is expected to look like "<filename> <label>"
    filenames = []
    with open(image_list_file, 'r') as f:
        for line in f:
            filename, label = line.rstrip('\n').split(' ')
            filenames.append(filename)
    return filenames
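Import the necessary libraries, such as os and shutil (and PIL, if you want to verify that a file really is an image).
Define the source directory the images are read from and the destination directory they are saved to.
Iterate over each file in the source directory and check whether it is an image file (the code below checks the extension; PIL can be used for a stricter check).
If the file is an image, find its corresponding text file by replacing the image extension (".jpg" here) with ".txt".
Extract the label from the text file and create a new folder named after that label in the destination directory.
Use shutil to copy the image file into the newly created folder.
Repeat the steps above for every image file in the source directory.
Here is code that implements these steps: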
import os
import shutil

# Define the source directory and destination directory
src_dir = "path/to/source/directory"
dst_dir = "path/to/destination/directory"
# Label files are assumed to sit next to the images; point this at the
# label folder instead if they are stored separately, as in the question
label_src_dir = src_dir

# Iterate through each file in the source directory
for filename in os.listdir(src_dir):
    filepath = os.path.join(src_dir, filename)
    # Check if the file is an image (by extension; covers .jpg and .jpeg)
    if filename.lower().endswith((".jpg", ".jpeg")):
        # Read the corresponding text file
        text_filename = os.path.splitext(filename)[0] + ".txt"
        text_filepath = os.path.join(label_src_dir, text_filename)
        with open(text_filepath, "r") as f:
            label = f.readline().strip()
        # Create a new folder with the label in the destination directory
        label_dir = os.path.join(dst_dir, label)
        os.makedirs(label_dir, exist_ok=True)
        # Copy the image file to the newly created folder
        dst_filepath = os.path.join(label_dir, filename)
        shutil.copyfile(filepath, dst_filepath)
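The question keeps images and labels in two separate folders and uses `.jpeg` files, so the same steps can also be sketched as a reusable function that takes both folders. This is only a minimal sketch, not the answerer's code: the name `sort_images_by_label`, its parameters, and the set of accepted extensions are assumptions made for illustration. Images without a label file are skipped rather than treated as an error.

```python
import shutil
from pathlib import Path

# Assumption: these are the image extensions worth handling
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def sort_images_by_label(image_dir, label_dir, out_dir):
    """Copy each image into out_dir/<label>/ and return {image name: label}."""
    image_dir, label_dir, out_dir = Path(image_dir), Path(label_dir), Path(out_dir)
    copied = {}
    for image_path in sorted(image_dir.iterdir()):
        # Skip sub-directories and anything that does not look like an image
        if not image_path.is_file() or image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # The label file shares the image's base name, e.g. image_0000.txt
        label_path = label_dir / (image_path.stem + ".txt")
        if not label_path.is_file():
            continue  # no label for this image, so leave it where it is
        lines = label_path.read_text().splitlines()
        label = lines[0].strip() if lines else ""
        if not label:
            continue  # empty label file
        target_dir = out_dir / label
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image_path, target_dir / image_path.name)
        copied[image_path.name] = label
    return copied
```

A call such as `sort_images_by_label("Images", "Label", "sorted")` (placeholder paths) would then produce one sub-folder per label value found in the text files.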
Just change the three directory variables at the top of the code below and it works like a charm. Prem J's answer is also cool.
import os
import shutil

# Assuming that the names of the images and the labels are the same except for the extension
image_dir = "f1/im"  # Images directory path
label_dir = "f1/la"  # Labels directory path
final_dir = "f1/dataset"  # Dataset directory where the images are saved in per-label folders

images = os.listdir(image_dir)  # list of all the image names
label = os.listdir(label_dir)  # list of all the label names

# Sort both lists so that each image and its label share the same index
images = sorted(images)  # ['test 1.jpg', 'test 2.jpg', 'test 3.jpg', 'test 4.jpg', 'test 5.jpg']
label = sorted(label)  # ['test 1.txt', 'test 2.txt', 'test 3.txt', 'test 4.txt', 'test 5.txt']

if not os.path.exists(final_dir):
    os.mkdir(final_dir)

for c, txt in enumerate(label):
    txt_path = os.path.join(label_dir, txt)
    img_path = os.path.join(image_dir, images[c])
    with open(txt_path, "r") as r:
        # Strip whitespace so a trailing newline does not end up in the folder name
        l = r.read().strip()
    dst = os.path.join(final_dir, l)
    if not os.path.exists(dst):
        os.mkdir(dst)
    shutil.copy(img_path, dst)
    shutil.copy(txt_path, dst)
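Pairing images and labels by index, as the code above does, only works if every image has exactly one label file and both lists sort into the same order. One way to check that before copying anything is to compare the base names in the two folders. The sketch below is an illustration only; `find_unpaired` is a hypothetical helper name, not part of the answer.

```python
import os


def find_unpaired(image_names, label_names):
    """Return (images without a label, labels without an image), both sorted."""
    # Map each base name (file name without extension) to the full file name
    image_stems = {os.path.splitext(name)[0]: name for name in image_names}
    label_stems = {os.path.splitext(name)[0]: name for name in label_names}
    missing_labels = sorted(
        image_stems[stem] for stem in image_stems.keys() - label_stems.keys()
    )
    missing_images = sorted(
        label_stems[stem] for stem in label_stems.keys() - image_stems.keys()
    )
    return missing_labels, missing_images


# Usage with the directories from the answer (paths are placeholders):
# missing_labels, missing_images = find_unpaired(os.listdir("f1/im"), os.listdir("f1/la"))
```

If both returned lists are empty, the sorted image and label lists line up one to one and the index-based loop is safe to run.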