How can I determine whether the text in an image is upside-down?


I need to determine whether the text in an image is upside-down. Examples of my images:

[Images 1-5: example images, not reproduced here]

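I currently do this by comparing the confidence scores of the original image and of the image rotated by 180 degrees, but this method sometimes gives wrong results, so I am looking for alternative methods to use as additional, independent checks.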
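For reference, the confidence comparison described above can be sketched roughly as follows. This is only an illustration under assumptions: `score_fn` is a hypothetical callable that returns a confidence value for an image (for example, a wrapper around whatever OCR engine is in use), and `margin` is an assumed minimum gap below which the result is reported as uncertain instead of being trusted.

```python
import numpy as np

def pick_orientation(image, score_fn, margin=5.0):
    """Compare a confidence score for the image and for its 180-degree rotation.

    score_fn: hypothetical callable (image -> float, higher = more confident).
    margin:   assumed minimum score gap required before a decision is made.
    """
    rotated = np.rot90(image, 2)  # two 90-degree turns = 180-degree rotation
    upright_score = score_fn(image)
    rotated_score = score_fn(rotated)

    if upright_score - rotated_score >= margin:
        return "upright"
    if rotated_score - upright_score >= margin:
        return "upside-down"
    # Scores are too close to call; fall back to an independent check.
    return "uncertain"
```

Returning "uncertain" when the two scores are close is one way to decide when a second, independent check is worth running.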

I tried counting the black pixels above and below the middle line, but this method sometimes fails, even on a perfect image without any defects.

~ $ python3 Upside5.py
Image is upside-down
above_middle
7750
below_middle
9112

[Image: the perfect image]

Can this method be fixed? Can you suggest alternative ways to determine whether text is upside-down?

Here is my code, which determines whether the text is upside-down by counting the black pixels above and below the middle line:

import cv2
import numpy as np

def process_image_and_draw_lines(image_path):
    def count_black_pixels_from_bottom(binary_img, N):
        height, width = binary_img.shape
        if N >= height:
            raise ValueError("N is out of range")
        return np.sum(binary_img[height - 1 - N, :] == 0)

    def count_black_pixels_from_top(binary_img, K):
        height, width = binary_img.shape
        if K >= height:
            raise ValueError("K is out of range")
        return np.sum(binary_img[K, :] == 0)

    def draw_line_on_row(img, row_number, color=(0, 0, 255), thickness=1):
        img[row_number:row_number+thickness, :] = color
        return img

    def find_highest_difference_row(binary_img, from_bottom=True):
        height = binary_img.shape[0]
        max_diff = 0
        max_diff_index = 0

        if from_bottom:
            for N in range(height - 1):
                black_pixels_current = count_black_pixels_from_bottom(binary_img, N)
                black_pixels_next = count_black_pixels_from_bottom(binary_img, N + 1)
                difference = black_pixels_next - black_pixels_current
                if difference > max_diff:
                    max_diff = difference
                    max_diff_index = N + 1
            return height - 1 - max_diff_index
        else:
            for K in range(height - 1):
                black_pixels_current = count_black_pixels_from_top(binary_img, K)
                black_pixels_next = count_black_pixels_from_top(binary_img, K + 1)
                difference = black_pixels_next - black_pixels_current
                if difference > max_diff:
                    max_diff = difference
                    max_diff_index = K + 1
            return max_diff_index

    def process_image(image_path):
        img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        if img is None:
            print(f"Error loading image {image_path}")
            return
        
        _, binary_img = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY)
        top_line = find_highest_difference_row(binary_img, from_bottom=False)
        bottom_line = find_highest_difference_row(binary_img, from_bottom=True)

        # Make sure top_line is the upper row (smaller index) and bottom_line the lower one.
        if top_line > bottom_line:
            top_line, bottom_line = bottom_line, top_line

        middle_line = (top_line + bottom_line) // 2

        img_copy = cv2.cvtColor(binary_img, cv2.COLOR_GRAY2BGR)
        img_copy = draw_line_on_row(img_copy, top_line, color=(0, 0, 255), thickness=1)
        img_copy = draw_line_on_row(img_copy, bottom_line, color=(0, 0, 255), thickness=1)
        img_copy = draw_line_on_row(img_copy, middle_line, color=(0, 255, 0), thickness=1)

        above_middle = np.sum(binary_img[:middle_line, :] == 0)
        below_middle = np.sum(binary_img[middle_line:, :] == 0)

        if above_middle > below_middle:
            print("above_middle")
            print(above_middle)
            print("below_middle")
            print(below_middle)
            print("Image is not upside-down.")
        elif below_middle > above_middle:
            print("Image is upside-down.")
            print("above_middle")
            print(above_middle)
            print("below_middle")
            print(below_middle)
        else:
            print("Number of black pixels above and below the middle line are equal.")

        cv2.imshow('Image with Lines', img_copy)
        cv2.waitKey(0)
        cv2.destroyAllWindows()

    process_image(image_path)

process_image_and_draw_lines('image_5.png')
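The above/below count can also be written more compactly with a row-projection profile (the number of black pixels in each row). The sketch below is a simplified variant, not a drop-in replacement: it assumes the text band is bounded by the first and last rows that contain any ink, whereas the code above uses the rows with the largest jump in black-pixel count. The function name `ink_above_below` is made up for this example.

```python
import numpy as np

def ink_above_below(binary_img):
    """Return (above, below) black-pixel counts relative to the middle of the text band.

    binary_img: 2-D uint8 array where 0 = ink and 255 = background.
    Assumption: the text band runs from the first to the last row containing ink.
    """
    profile = np.sum(binary_img == 0, axis=1)  # black pixels per row
    ink_rows = np.flatnonzero(profile)         # indices of rows that contain ink
    if ink_rows.size == 0:
        raise ValueError("image contains no black pixels")

    top, bottom = ink_rows[0], ink_rows[-1]
    middle = (top + bottom + 1) // 2           # first row of the lower half

    above = int(profile[top:middle].sum())
    below = int(profile[middle:bottom + 1].sum())
    return above, below
```

Printing the whole `profile` for a failing image may help show whether the middle line is being placed where expected.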
Tags: image, text, ocr

1 Answer

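One possible alternative for verifying the OCR result, focused on analyzing the image itself, is to extract individual character objects and then check the characteristic properties of letters such as "A", "Y" or "C", "E", "F", "K" to decide the orientation, much as a human would.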

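An upside-down "C", "E", "F" or "K" has most of its pixels on the right side, whereas a correctly oriented one has most of them on the left.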
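A minimal sketch of this left/right check might look like the following. It assumes the character has already been isolated and cropped tightly to a boolean array (True = ink), for example via connected-component analysis, and that it is known to be one of the left-heavy letters. The function names and the synthetic "E" are made up for illustration.

```python
import numpy as np

def left_right_ink(glyph):
    """Count ink pixels in the left and right halves of one character crop.

    glyph: 2-D boolean array, True = ink, cropped tightly to a single character.
    For odd widths the middle column is skipped so both halves have equal size.
    """
    width = glyph.shape[1]
    half = width // 2
    left = int(glyph[:, :half].sum())
    right = int(glyph[:, width - half:].sum())
    return left, right

def looks_upside_down(glyph):
    """Heuristic for letters such as C, E, F, K: more ink on the right
    than on the left suggests the glyph is rotated by 180 degrees."""
    left, right = left_right_ink(glyph)
    return right > left

# Synthetic 5x4 letter "E" used only to exercise the functions.
E = np.array([
    [1, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
], dtype=bool)

print(looks_upside_down(E))               # upright glyph
print(looks_upside_down(np.rot90(E, 2)))  # same glyph rotated by 180 degrees
```

In practice the decision would probably be taken as a vote over several characters, since a single glyph can be misleading.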
