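I am working on extracting binary silhouettes of the people in images from the publicly available UT-Interaction dataset. I applied Felzenszwalb's segmentation, which divides the image into segments based on intensity and spatial location. From those segments I then need to extract a binary silhouette of the people, but I cannot get this step to work reliably. This is the original image:
This is the segmented image after applying Felzenszwalb's segmentation with scale = 200 and sigma = 0.8, where each color represents one segment:
This is the binary image after extracting the silhouette with a few thresholds. I exclude segments whose mean intensity is below a threshold, or whose height, width, or area is above a threshold:
Finally, this is the binary image after one iteration of morphological erosion:
As you can see, segments that belong to neither of the two people have also become part of the binary silhouette, and they are not removed properly even after the erosion. I tried different parameter values for the segmentation, which can either favor large segments that ignore small variations between pixels or favor small segments that pick up the slightest variation, and different numbers of erosion iterations. The segmentation and silhouette extraction work well for some images, but for many others I have to change the segmentation parameters, the thresholds used in the silhouette extraction, and the number of erosion iterations. I want to automate this process, but I cannot find settings that work for every image in the dataset. What can I do?
Here is the code: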
import numpy as np
import cv2
import skimage.segmentation as seg
from skimage.color import rgb2gray
from skimage.measure import regionprops

def morphological_erosion(binary_silhouette):
    # Create a structuring element (consider different shapes)
    kernel = np.ones((3, 3), np.uint8)  # Basic 3x3 square
    binary_silhouette = cv2.erode(binary_silhouette, kernel, iterations=1)
    return binary_silhouette

def felzenszwalb_segmentation(rgb_image):
    # Perform segmentation using Felzenszwalb's method
    segments = seg.felzenszwalb(rgb_image, scale=200, sigma=0.8, min_size=50)
    # Generate random colors for each segment
    num_segments = len(np.unique(segments))
    colors = np.random.randint(0, 256, size=(num_segments, 3), dtype=np.uint8)
    # Create a blank RGB image of the same size as the input image
    segmented_image = np.zeros_like(rgb_image)
    # Assign random colors to each segment
    for segment_id in np.unique(segments):
        mask = (segments == segment_id)
        segmented_image[mask] = colors[segment_id]
    # Display the segmented image
    cv2.imshow("segmented Image", segmented_image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    return segments

def create_binary_silhoutte(image_segments, image):
    # Convert the image to grayscale in [0, 1]; cv2.imread returns BGR,
    # so reverse the channel order before calling rgb2gray
    gray_image = rgb2gray(image[..., ::-1])
    # Create an empty 8-bit binary image
    binary_image = np.zeros(gray_image.shape, dtype=np.uint8)
    # regionprops ignores label 0 and Felzenszwalb labels start at 0,
    # so shift the labels by one to keep the first segment
    for region in regionprops(image_segments + 1, intensity_image=gray_image):
        # Get the mean intensity and area of the segment
        mean_intensity = region.mean_intensity
        area = region.area
        # Get bounding box coordinates
        min_row, min_col, max_row, max_col = region.bbox
        # Calculate height and width
        height = max_row - min_row
        width = max_col - min_col
        # Keep segments that are bright enough and not too large
        if mean_intensity > 0.5 and area < 50000 and height < 500 and width < 500:
            binary_image[region.coords[:, 0], region.coords[:, 1]] = 255
    return binary_image

if __name__ == "__main__":
    # Example usage
    image_path = # path to image
    image = cv2.imread(image_path)
    # Apply Felzenszwalb segmentation to get the silhouette
    segments = felzenszwalb_segmentation(image)
    binary_silhouette = create_binary_silhoutte(segments, image)
    cv2.imshow("binary Silhouette", binary_silhouette)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    binary_silhouette = morphological_erosion(binary_silhouette)
    cv2.imshow("binary Silhouette", binary_silhouette)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
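I also tried applying a noise-removal filter, such as a bilateral filter, before the segmentation, but that did not help. Any help would be greatly appreciated.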
I used Felzenszwalb's segmentation with the parameters scale=650, sigma=3, min_size=1000, which gives the following result:
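To get a feel for how these parameters change the segmentation, here is a minimal sketch that compares the question's settings with the ones above. The frame is a synthetic stand-in (two bright rectangles on a dark, noisy background), not an image from the dataset, so the numbers it prints are only illustrative.

```python
import numpy as np
from skimage import segmentation

# Synthetic stand-in for a video frame (hypothetical data, not from the dataset):
# a dark background with two brighter rectangles and mild Gaussian noise.
rng = np.random.default_rng(0)
frame = np.full((120, 160, 3), 60, dtype=np.float64)
frame[30:100, 30:60] = 180
frame[30:100, 100:130] = 200
frame += rng.normal(0, 8, frame.shape)
frame = np.clip(frame, 0, 255).astype(np.uint8)

# Parameters from the question and from this answer, respectively
fine = segmentation.felzenszwalb(frame, scale=200, sigma=0.8, min_size=50)
coarse = segmentation.felzenszwalb(frame, scale=650, sigma=3, min_size=1000)

print("segments with scale=200, sigma=0.8, min_size=50: ", len(np.unique(fine)))
print("segments with scale=650, sigma=3, min_size=1000:", len(np.unique(coarse)))
# min_size is enforced by a final merge step, so no segment ends up smaller than it
print("smallest coarse segment (pixels):", np.bincount(coarse.ravel()).min())
```

A larger scale and min_size typically produce fewer, larger segments, which leaves fewer small background fragments to filter out afterwards.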
You can then iterate over the segment labels and use morphological opening and border clearing to extract the silhouettes.
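As a rough illustration of why the opening comes before the border clearing, consider a toy mask (made up for this sketch) in which the blob of interest is attached to a border-touching blob by a one-pixel bridge. The mask is converted to uint8 before the opening purely as a precaution.

```python
import numpy as np
from skimage import morphology, segmentation

# Toy mask (hypothetical): a central blob joined by a 1-pixel-wide bridge
# to a blob that touches the left image border.
mask = np.zeros((30, 40), dtype=bool)
mask[9:21, 20:32] = True   # central blob, the part we want to keep
mask[8:22, 0:8] = True     # background blob touching the border
mask[15, 8:20] = True      # thin bridge connecting the two

# Without opening, everything is one component that touches the border,
# so clear_border removes all of it.
naive = segmentation.clear_border(mask)

# Opening with a small diamond breaks the thin bridge first...
opened = morphology.opening(mask.astype(np.uint8), morphology.diamond(1)).astype(bool)
# ...so clear_border now drops only the border-touching blob.
cleaned = segmentation.clear_border(opened)

print("pixels kept without opening:", int(naive.sum()))
print("pixels kept with opening:   ", int(cleaned.sum()))
```

The radius of the structuring element has to be larger than the bridges you want to break but smaller than the limbs you want to keep, so it may still need tuning per dataset.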
Here is the complete code:
import cv2
import matplotlib.pyplot as plt
import numpy as np
from skimage import segmentation, morphology
image_path = # path to image
image = cv2.imread(image_path)
# Apply Felzenszwalb segmentation to get the silhouette
segments_fz = segmentation.felzenszwalb(image, scale=650, sigma=3, min_size=1000)
# Create an empty binary image
binary_silhouettes = np.zeros_like(segments_fz)
for label in np.unique(segments_fz):
    # Disconnect objects
    silhouette = morphology.opening(segments_fz == label, morphology.diamond(3))
    # Clear objects connected to the image border
    silhouette = segmentation.clear_border(silhouette)
    # Remove small objects, i.e. objects with area<1000 will be removed
    silhouette = morphology.area_opening(silhouette, 1000)
    # Aggregate silhouette masks
    binary_silhouettes += silhouette
plt.imshow(binary_silhouettes)
The final result looks like this:
Or, if we apply the mask to the original image as follows:
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
mask_3d = np.repeat(binary_silhouettes[:,:,np.newaxis], 3, axis=-1)
plt.imshow(mask_3d * image_rgb)
we get:
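If you want to save the masked image or pass it back to OpenCV, it may help to keep it as uint8. Multiplying by the integer mask upcasts the result to the mask's dtype, whereas boolean indexing keeps the image dtype. A small sketch, with tiny made-up arrays standing in for image_rgb and binary_silhouettes:

```python
import numpy as np

# Hypothetical stand-ins for the arrays built above: a small RGB image and a 0/1 mask
image_rgb = np.full((4, 6, 3), 200, dtype=np.uint8)
binary_silhouettes = np.zeros((4, 6), dtype=np.int64)
binary_silhouettes[1:3, 2:5] = 1

# Copy only the pixels inside the silhouette; everything else stays black
keep = binary_silhouettes > 0
masked = np.zeros_like(image_rgb)
masked[keep] = image_rgb[keep]

print(masked.dtype, masked.shape)
```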