I am a beginner with OpenCV. I am trying to convert a disparity map into a 3D point cloud, but the output does not look anything like a 3D rendering of the 2D image. I am not sure whether I am using the techniques OpenCV provides as intended, and I am hoping for some help or advice to resolve this.
I created a disparity map as follows:
import cv2
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d
from mpl_toolkits.mplot3d import Axes3D
# Load the left and right images in grayscale
left_image = cv2.imread('Adirondack-perfect/im0.png', 0)   # adjust to your left image path
right_image = cv2.imread('Adirondack-perfect/im1.png', 0)  # adjust to your right image path
# Initialize the stereo block matching object
stereo = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=64,      # must be a multiple of 16
    blockSize=5,            # decrease for more detail
    P1=8 * 5**2,            # consider lowering for less smoothing
    P2=32 * 5**2,           # consider lowering for less smoothing
    disp12MaxDiff=10,       # non-zero enables the left-right consistency check
    uniquenessRatio=10,     # increase for more reliable matches
    speckleWindowSize=100,  # increase to filter out noise
    speckleRange=32,        # increase to filter out noise
    preFilterCap=63,
    mode=cv2.STEREO_SGBM_MODE_SGBM_3WAY
)
# Compute the disparity map
disparity_map = stereo.compute(left_image, right_image)
cv2.imwrite('disparity_map.png', disparity_map)
img = disparity_map.copy()
plt.imshow(img, 'CMRmap_r')
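Worth noting at this stage: StereoSGBM.compute() returns fixed-point disparities (CV_16S, scaled by 16), so the saved PNG and the plot show 16x the true pixel disparity. A minimal sketch of converting it first, continuing from the variables above (the output filename is illustrative):
# StereoSGBM stores disparities as 16-bit fixed point with 4 fractional bits,
# i.e. the raw values are (true disparity in pixels) * 16
true_disp = disparity_map.astype(np.float32) / 16.0
# normalize only for viewing/saving; keep true_disp for any geometry
disp_vis = cv2.normalize(true_disp, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imwrite('disparity_map_vis.png', disp_vis)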
The output looks great:
Now I try to convert it into a 3D point cloud with the following code:
# Intrinsic parameters of the camera
focal_length = 4161.221  # assuming the focal length is the same in x and y
cx = 1445.577            # x-coordinate of the principal point
cy = 984.686             # y-coordinate of the principal point
baseline = 176.252       # distance between the two camera centers (mm)

# Creating the Q matrix for reprojecting
Q = np.float32([
    [1, 0, 0,            -cx],
    [0, 1, 0,            -cy],
    [0, 0, 0,   focal_length],
    [0, 0, -1 / baseline,  0]
])
# Reproject the points to 3D
points_3D = cv2.reprojectImageTo3D(disparity_map, Q)

# Reshape the points to a 2D array where each row is a point
points = points_3D.reshape(-1, 3)

# Filter out pixels at the minimum disparity (no valid measurement)
mask_map = disparity_map > disparity_map.min()
filtered_points = points[mask_map.ravel()]

# filtered_points now contains the 3D coordinates of the remaining pixels

# Visualization (optional): a matplotlib scatter plot of the 3D points
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(filtered_points[:, 0], filtered_points[:, 1], filtered_points[:, 2], s=1)
plt.show()
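For intuition about what reprojectImageTo3D computes: it multiplies Q by the homogeneous pixel vector (x, y, d, 1), so depth has magnitude Z = f·B/d (with the -1/baseline entry in Q above, Z comes out negative). A quick sanity check using the intrinsics above; the 300 px disparity is a hypothetical value for a near point, not measured from the data:
# depth magnitude from a single disparity value; units follow the baseline (mm)
d = 300.0  # hypothetical disparity for a near point, in pixels
Z = focal_length * baseline / d
print(Z)  # ~2445 mm, i.e. a near object roughly 2.4 m from the camera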
But the output looks like this:
Can someone help me understand what went wrong, and what I can do to clean up this chaotic output? Given that I am not using any advanced techniques, is this expected behavior for such a small dataset?
This is where I got the dataset:
https://vision.middlebury.edu/stereo/data/scenes2014/
Thanks
Disparity matrix:
You are trying to compute disparity over a search range of 0..63 pixels.
These input pictures are huge, however. For the closest parts of the armrests, some matching points in these pictures have a disparity of around 300 pixels.
The SGBM matcher does its best, but in those regions it will not find a good match within 63 pixels, so the resulting values are garbage. That is what you are seeing. The same goes for every other part of the picture that is "too close".
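To see how far off the current settings are, here is a quick back-of-the-envelope sketch, assuming the ~300 px figure above, of picking a search range that actually covers the scene (numDisparities must be a multiple of 16):
import cv2
import numpy as np
max_expected_disparity = 300  # rough figure for the nearest parts of the scene
num_disp = int(np.ceil(max_expected_disparity / 16.0)) * 16  # -> 304, a multiple of 16
stereo = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=num_disp,  # covers ~300 px, at a real cost in runtime
    blockSize=5
)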
Your options: either crank up the disparity search range (as in the sketch above), or downsample the input images. I would recommend the latter because it is cheaper. You can use a few applications of cv.pyrDown() for that, or a single cv.resize() with interpolation=INTER_AREA. Here it is with the images downsampled to a quarter of their size, i.e. cv.resize() with dsize=None, fx=0.25, fy=0.25, and the disparity search rerun (whole pixels, -1 becomes NaN). The map looks reasonable. You can scale that back up to the original resolution if you need to; remember to also scale the disparity values by the same factor so they match the original resolution. A sketch of the whole pipeline follows.
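A minimal sketch of that pipeline, assuming the quarter-scale factor above and reusing left_image/right_image from the question (the SGBM settings are illustrative, not tuned):
import cv2
import numpy as np
scale = 0.25  # quarter resolution, as suggested above
left_small = cv2.resize(left_image, dsize=None, fx=scale, fy=scale,
                        interpolation=cv2.INTER_AREA)
right_small = cv2.resize(right_image, dsize=None, fx=scale, fy=scale,
                         interpolation=cv2.INTER_AREA)
# ~300 px at full resolution becomes ~75 px here; 80 is the next multiple of 16
stereo_small = cv2.StereoSGBM_create(minDisparity=0, numDisparities=80, blockSize=5)
disp_small = stereo_small.compute(left_small, right_small).astype(np.float32) / 16.0
disp_small[disp_small < 0] = np.nan  # invalid matches come out as -1; turn them into NaN
# optional: back to full resolution; the disparity VALUES must be rescaled too
disp_full = cv2.resize(disp_small, dsize=None, fx=1/scale, fy=1/scale,
                       interpolation=cv2.INTER_NEAREST) / scale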