Need to segment each digit separately from an image


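I trained a CNN model on the MNIST dataset, and I want to use it to predict a sequence of digits that appears in an image. The approach is to segment each digit out of the image and feed the crops to the model one at a time. I am having trouble with the segmentation step because I have two different types of images. I need a robust technique that removes the noise and shadows in the image and segments each digit separately. I have attached both images. I am looking for a robust technique and code.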


I tried the following code and technique, but it does not work on the attached images:

import cv2
import matplotlib.pyplot as plt


def segment_and_display_digits(image_path):
    # Read image
    img = cv2.imread(image_path)
    if img is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Get image dimensions
    height, width = gray.shape
    total_area = height * width

    # Apply adaptive thresholding (ink becomes white on a black background)
    thresh = cv2.adaptiveThreshold(
        gray,
        255,
        cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
        cv2.THRESH_BINARY_INV,
        21,  # Block size
        10   # C constant
    )

    # Find contours
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    # Filter contours based on area
    valid_contours = []
    min_area = total_area * 0.001  # Minimum 0.1% of image area
    max_area = total_area * 0.5    # Maximum 50% of image area

    for cnt in contours:
        area = cv2.contourArea(cnt)
        if min_area < area < max_area:
            x, y, w, h = cv2.boundingRect(cnt)
            aspect_ratio = w / float(h)
            # Check if aspect ratio is reasonable for a digit (not too wide or tall)
            if 0.2 < aspect_ratio < 2:
                valid_contours.append(cnt)

    # Sort contours from left to right
    valid_contours = sorted(valid_contours, key=lambda c: cv2.boundingRect(c)[0])

    # Extract and display digits
    digits = []
    padding = int(min(height, width) * 0.02)  # Adaptive padding based on image size

    for cnt in valid_contours:
        x, y, w, h = cv2.boundingRect(cnt)
        # Add padding while keeping within image bounds
        x1 = max(0, x - padding)
        y1 = max(0, y - padding)
        x2 = min(width, x + w + padding)
        y2 = min(height, y + h + padding)
        digit = img[y1:y2, x1:x2]
        digits.append(digit)

    # Display results
    if digits:
        # Create visualization of original image with detected digits
        img_with_boxes = img.copy()
        for cnt in valid_contours:
            x, y, w, h = cv2.boundingRect(cnt)
            cv2.rectangle(img_with_boxes, (x, y), (x + w, y + h), (0, 255, 0), 2)

        # Plot original image with boxes and segmented digits
        plt.figure(figsize=(15, 5))

        # Top row: original image with boxes
        plt.subplot(2, 1, 1)
        plt.imshow(cv2.cvtColor(img_with_boxes, cv2.COLOR_BGR2RGB))
        plt.title('Detected Digits')
        plt.axis('off')

        # Bottom row: one subplot per segmented digit
        for i, digit in enumerate(digits):
            plt.subplot(2, len(digits), len(digits) + i + 1)
            plt.imshow(cv2.cvtColor(digit, cv2.COLOR_BGR2RGB))
            plt.axis('off')
            plt.title(f'Digit {i+1}')

        plt.tight_layout()
        plt.show()
    else:
        print("No digits found in the image")
machine-learning computer-vision ocr handwriting-recognition
1 Answer

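To compensate for uneven illumination, the standard technique is to first estimate the illumination and then divide the image by that estimate.

The background is a white sheet of paper, which makes this easy. I will estimate the illumination with a median blur.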

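import cv2 as cv
import numpy as np

im = cv.imread("KifRNuGy.jpg")

# Estimate the illumination: a large median blur removes the thin strokes
# and leaves only the slowly varying paper brightness
illumination = cv.medianBlur(im, 101)

# Divide in floating point; clamp the divisor to avoid division by zero
compensated = im.astype(np.float32) / np.maximum(illumination, 1)

# arbitrary 0.8 to keep the bright background within range
compensated = (0.8 * 255 * np.clip(compensated, 0, 1)).astype(np.uint8)

compensated  # evaluates to the result, e.g. as the last line of a notebook cell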
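
The answer above stops at the compensated image. As one possible next step, here is a minimal sketch of splitting such an image into digits by column projection. It is not part of the original answer, and it assumes the compensated image has been converted to grayscale, the digits are dark on a bright background, and they do not overlap horizontally. The function name `split_digits_by_columns`, the threshold of 128, and the synthetic test page are all hypothetical, chosen only for illustration.

```python
import numpy as np


def split_digits_by_columns(gray, threshold=128, min_width=2):
    """Return (start, end) column ranges of ink runs in a grayscale image.

    Assumes dark ink on a bright, already illumination-compensated background.
    The end index is exclusive, so gray[:, start:end] is the crop.
    """
    ink = gray < threshold          # True where a pixel is dark enough to be ink
    has_ink = ink.any(axis=0)       # one flag per column
    # Pad with False so every run has a detectable start and end
    padded = np.concatenate(([False], has_ink, [False])).astype(np.int8)
    changes = np.diff(padded)
    starts = np.flatnonzero(changes == 1)    # first column of each run
    ends = np.flatnonzero(changes == -1)     # one past the last column of each run
    # Drop runs narrower than min_width, which are likely specks of noise
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_width]


# Synthetic example: a uniform bright page with two dark bars standing in for digits
page = np.full((20, 40), 204, dtype=np.uint8)
page[5:15, 5:10] = 30
page[5:15, 20:28] = 30

spans = split_digits_by_columns(page)
crops = [page[:, s:e] for s, e in spans]
print(spans)
```

Column projection only separates digits that have a gap of clean background between them. If digits touch or overlap horizontally, running a contour or connected-component search (such as the `cv2.findContours` call in the question) on the thresholded compensated image is likely a better fit.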
