How can I modify my code to handle RGBX (4-channel) images for semantic segmentation?

Problem description · Votes: 0 · Answers: 1

I'm new to this field and have been following a U-Net tutorial for semantic segmentation on 3-channel RGB images (https://www.youtube.com/watch?v=68HR_eyzk00&list=PLZsOBAyNTZwbR08R959iCvYT3qzhxvGOE&index=2&ab_channel=DigitalSreeni), which works well for me. However, I now need to extend the pipeline to support 4-channel RGBX images (i.e. RGB plus one extra channel), and I'm not sure how to modify the code to accommodate the extra channel, especially the preprocessing and ImageDataGenerator parts (I don't think ImageDataGenerator supports 4-channel images).

Here is the code (after patchifying the images to (256, 256, 4) and the masks to (256, 256)):

import os
import cv2
import numpy as np
import glob
from matplotlib import pyplot as plt
from patchify import patchify
import tensorflow as tf
import splitfolders
import segmentation_models as sm
from tensorflow.keras.metrics import MeanIoU
from sklearn.preprocessing import MinMaxScaler
from keras.utils import to_categorical


input_folder='path folder to my images and masks '
output_folder='path to output folder'
#split with a ratio
splitfolders.ratio(input_folder, output=output_folder, seed=42, ratio=(.75,.25),group_prefix=None) 

#Rearrange the folder structure for keras augmentation


seed=24
batch_size=16 
n_classes=2 


scaler=MinMaxScaler()


BACKBONE='resnet34'  
preprocess_input=sm.get_preprocessing(BACKBONE)

def preprocess_data(img, mask, num_class):
    #Scale images
    img=scaler.fit_transform(img.reshape(-1, img.shape[-1])).reshape(img.shape)
    img=preprocess_input(img)  #Preprocess based on the pretrained backbone
    mask=to_categorical(mask, num_class)
    return (img,mask)

from tensorflow.keras.preprocessing.image import ImageDataGenerator
def trainGenerator(train_img_path, train_mask_path, num_class):
    img_data_gen_args=dict(horizontal_flip=True, vertical_flip=True, fill_mode='reflect') #Data augmentation
    
    image_datagen=ImageDataGenerator(**img_data_gen_args)
    mask_datagen=ImageDataGenerator(**img_data_gen_args)
    
    image_generator=image_datagen.flow_from_directory(train_img_path, class_mode=None, batch_size=batch_size, seed=seed)
    mask_generator=mask_datagen.flow_from_directory(train_mask_path, class_mode=None, color_mode='grayscale', batch_size=batch_size, seed=seed)
    
    train_generator=zip(image_generator, mask_generator)
    
    for (img, mask) in train_generator:
        img, mask= preprocess_data(img, mask, num_class)
        yield (img, mask)

train_img_path='path for training images'
train_mask_path='path for training masks'
train_img_gen=trainGenerator(train_img_path, train_mask_path, num_class=2)

val_img_path='path for validation images'
val_mask_path='path for validation masks'
val_img_gen=trainGenerator(val_img_path, val_mask_path, num_class=2)


x, y=train_img_gen.__next__()

for i in range(0,3):
    image=x[i]
    mask=np.argmax(y[i], axis=2)
    plt.subplot(1,2,1)
    plt.imshow(image)
    plt.subplot(1,2,2)
    plt.imshow(mask, cmap='gray')
    plt.show()


num_train_imgs=len(os.listdir('path for training images'))
num_val_images=len(os.listdir('path for validation images'))
steps_per_epochs=num_train_imgs//batch_size
val_steps_per_epoch=num_val_images//batch_size

IMG_HEIGHT=x.shape[1]
IMG_WIDTH=x.shape[2]
IMG_CHANNELS=x.shape[3]

n_classes=2

model=sm.Unet(BACKBONE, encoder_weights=None, input_shape=(IMG_HEIGHT,IMG_WIDTH,IMG_CHANNELS), classes=n_classes, activation='softmax')
model.compile('Adam', loss=sm.losses.binary_crossentropy, metrics=[sm.metrics.iou_score, sm.metrics.FScore()])

history=model.fit(train_img_gen, steps_per_epoch=steps_per_epochs, epochs=100, verbose=1, validation_data=val_img_gen, validation_steps=val_steps_per_epoch)


python machine-learning keras deep-learning semantic-segmentation
1 Answer

0 votes

You can drop the fourth band (which is usually an alpha channel) while reading the data with OpenCV, as shown below.

import cv2

img = cv2.imread(filename)  # the default IMREAD_COLOR flag returns a 3-channel BGR array and drops the 4th band
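
If you prefer to make the dropped band explicit (for example to inspect it first), a small variant of the same idea is to read the file unchanged and slice off the first three channels; filename here is just a placeholder:

import cv2

img = cv2.imread(filename, cv2.IMREAD_UNCHANGED)  # keeps all 4 bands (BGR + extra)
img = img[:, :, :3]                               # keep only the 3 colour bands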

If the workflow requires image paths rather than numpy objects, then I would probably run a preprocessing step that writes 3-channel copies of the images to a new directory and point train_img_path at that directory, as sketched below.
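
A minimal sketch of that preprocessing step, assuming the patches are stored as .png files; the source and destination directory names are placeholders for your own paths:

import glob
import os

import cv2

src_dir = 'path to the 4-channel image patches'  # hypothetical source directory
dst_dir = 'path for training images'             # point train_img_path here
os.makedirs(dst_dir, exist_ok=True)

for src in glob.glob(os.path.join(src_dir, '*.png')):
    img = cv2.imread(src, cv2.IMREAD_UNCHANGED)  # read all 4 bands
    rgb = img[:, :, :3]                          # keep only the first 3 bands
    cv2.imwrite(os.path.join(dst_dir, os.path.basename(src)), rgb)

After this step, the rest of the pipeline can stay exactly as in the original 3-channel tutorial.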
