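I need to determine whether the text in an image is upside down. My example image:
I currently do this by comparing the confidence scores for the original image and for the image rotated by 180 degrees, but this sometimes gives the wrong result, so I am looking for alternative methods that I can use as additional, independent checks.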
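For illustration, the comparison described above might be sketched as follows. This is only a sketch under assumptions: `score_fn` is a hypothetical callable that returns a confidence for an image (for example, from an OCR engine), `toy_score` is a stand-in that simply compares pixels against a fixed 3x3 glyph, and the `margin` parameter is an added idea so that near-ties are reported as "uncertain" instead of being forced into a decision.

```python
import numpy as np


def classify_by_confidence(image, score_fn, margin=0.05):
    """Compare the score of the image with the score of its 180-degree rotation.

    score_fn is a hypothetical callable: image -> confidence (higher is better).
    margin is an assumed dead zone; differences inside it count as "uncertain".
    """
    upright_score = score_fn(image)
    flipped_score = score_fn(np.rot90(image, 2))  # two 90-degree turns = 180 degrees
    if flipped_score - upright_score > margin:
        return "upside-down"
    if upright_score - flipped_score > margin:
        return "upright"
    return "uncertain"


# Toy stand-in for a real confidence score (NOT an OCR engine):
# the fraction of pixels that match a fixed "L"-shaped glyph.
GLYPH = np.array([[0, 255, 255],
                  [0, 255, 255],
                  [0,   0,   0]], dtype=np.uint8)


def toy_score(image):
    return float(np.mean(image == GLYPH))


print(classify_by_confidence(GLYPH, toy_score))               # upright
print(classify_by_confidence(np.rot90(GLYPH, 2), toy_score))  # upside-down
```

With a real engine, only `score_fn` would change; the margin would have to be tuned on images whose orientation is known.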
I tried counting the black pixels above and below the middle line, but this approach also fails sometimes, even for clean images without any defects.
~ $ python3 Upside5.py
Image is upside-down
above_middle
7750
below_middle
9112
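Can this approach be fixed? Can you suggest alternative methods for determining whether text is upside down?
Here is my code, which decides whether the text is upside down by counting the black pixels above and below the middle line: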
import cv2
import numpy as np


def process_image_and_draw_lines(image_path):
    def count_black_pixels_from_bottom(binary_img, N):
        # Black pixels in the N-th row counted from the bottom (N = 0 is the last row).
        height, width = binary_img.shape
        if N >= height:
            raise ValueError("N is out of range")
        return np.sum(binary_img[height - 1 - N, :] == 0)

    def count_black_pixels_from_top(binary_img, K):
        # Black pixels in row K counted from the top (K = 0 is the first row).
        height, width = binary_img.shape
        if K >= height:
            raise ValueError("K is out of range")
        return np.sum(binary_img[K, :] == 0)

    def draw_line_on_row(img, row_number, color=(0, 0, 255), thickness=1):
        # Paint a horizontal line (BGR color) starting at the given row.
        img[row_number:row_number + thickness, :] = color
        return img

    def find_highest_difference_row(binary_img, from_bottom=True):
        # Scan row by row and return the row index at which the black-pixel
        # count rises the most compared with the previously scanned row.
        height = binary_img.shape[0]
        max_diff = 0
        max_diff_index = 0
        if from_bottom:
            for N in range(height - 1):
                black_pixels_current = count_black_pixels_from_bottom(binary_img, N)
                black_pixels_next = count_black_pixels_from_bottom(binary_img, N + 1)
                difference = black_pixels_next - black_pixels_current
                if difference > max_diff:
                    max_diff = difference
                    max_diff_index = N + 1
            # Convert the offset from the bottom back into a row index.
            return height - 1 - max_diff_index
        else:
            for K in range(height - 1):
                black_pixels_current = count_black_pixels_from_top(binary_img, K)
                black_pixels_next = count_black_pixels_from_top(binary_img, K + 1)
                difference = black_pixels_next - black_pixels_current
                if difference > max_diff:
                    max_diff = difference
                    max_diff_index = K + 1
            return max_diff_index

    def process_image(image_path):
        img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        if img is None:
            print(f"Error loading image {image_path}")
            return

        # Binarize: pixels above 128 become white (255), the rest black (0).
        _, binary_img = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY)

        top_line = find_highest_difference_row(binary_img, from_bottom=False)
        bottom_line = find_highest_difference_row(binary_img, from_bottom=True)

        # After this swap top_line holds the larger row index; the midpoint is unaffected.
        if bottom_line > top_line:
            top_line, bottom_line = bottom_line, top_line

        middle_line = (top_line + bottom_line) // 2

        # Draw the two detected lines in red and the middle line in green.
        img_copy = cv2.cvtColor(binary_img, cv2.COLOR_GRAY2BGR)
        img_copy = draw_line_on_row(img_copy, top_line, color=(0, 0, 255), thickness=1)
        img_copy = draw_line_on_row(img_copy, bottom_line, color=(0, 0, 255), thickness=1)
        img_copy = draw_line_on_row(img_copy, middle_line, color=(0, 255, 0), thickness=1)

        # Count black pixels in the rows above the middle line and from it downward.
        above_middle = np.sum(binary_img[:middle_line, :] == 0)
        below_middle = np.sum(binary_img[middle_line:, :] == 0)

        if above_middle > below_middle:
            print("above_middle")
            print(above_middle)
            print("below_middle")
            print(below_middle)
            print("Image is not upside-down.")
        elif below_middle > above_middle:
            print("Image is upside-down.")
            print("above_middle")
            print(above_middle)
            print("below_middle")
            print(below_middle)
        else:
            print("Number of black pixels above and below the middle line are equal.")

        cv2.imshow('Image with Lines', img_copy)
        cv2.waitKey(0)
        cv2.destroyAllWindows()

    process_image(image_path)


process_image_and_draw_lines('image_5.png')
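The row-by-row loops above can also be written with NumPy array operations, which makes the heuristic easier to try on small synthetic arrays without opening a window. The following is a sketch only: `largest_jump_row` and `midline_counts` are hypothetical helper names, and the 10x6 test array is made up for illustration. It is meant to reproduce the same line detection and the same above/below counts, not to make the heuristic any more reliable.

```python
import numpy as np


def largest_jump_row(counts):
    """Index of the row that follows the largest increase in counts (0 if none)."""
    jumps = np.diff(counts)
    if jumps.size == 0 or jumps.max() <= 0:
        return 0
    return int(np.argmax(jumps)) + 1  # argmax returns the first maximum


def midline_counts(binary_img):
    """Return (middle_row, black_above, black_below) for a 0/255 image."""
    counts = np.count_nonzero(binary_img == 0, axis=1)  # black pixels per row
    height = counts.size
    line_from_top = largest_jump_row(counts)
    # Scan the reversed profile, then map the offset back to a row index.
    line_from_bottom = height - 1 - largest_jump_row(counts[::-1])
    middle = (line_from_top + line_from_bottom) // 2
    above = int(counts[:middle].sum())
    below = int(counts[middle:].sum())
    return middle, above, below


# Synthetic example: a light band (rows 2-3) above a solid band (rows 4-7).
img = np.full((10, 6), 255, dtype=np.uint8)
img[2:4, :2] = 0
img[4:8, :] = 0
print(midline_counts(img))  # (5, 10, 18)
```

Feeding in hand-made arrays like this one is a quick way to see which pixel distributions push the count to the wrong side of the middle line.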
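One possible alternative for verifying the OCR result, based on analyzing the image itself, is to extract the individual characters as objects and then check the characteristic properties of letters such as "A" and "Y", or "C", "E", "F", and "K", to decide the orientation the way a human would.
An upside-down "C", "E", "F", or "K" has most of its pixels on the right side, whereas in the correct orientation most of them are on the left.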
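A minimal sketch of that left/right check, assuming each character has already been cropped into its own binarized array (black = 0, white = 255) and is known to be one of "C", "E", "F", or "K". The function names and the hand-drawn 5x4 "E" are hypothetical and only serve as an illustration; character extraction itself is not shown.

```python
import numpy as np


def horizontal_ink_balance(glyph):
    """Return (left, right) black-pixel counts for a binarized character crop."""
    black = glyph == 0
    width = black.shape[1]
    half = width // 2
    left = int(black[:, :half].sum())
    right = int(black[:, width - half:].sum())  # centre column is skipped if width is odd
    return left, right


def vote_orientation(glyph):
    """Vote for letters like C, E, F, K, whose ink sits mostly on the left when upright."""
    left, right = horizontal_ink_balance(glyph)
    if left > right:
        return "upright"
    if right > left:
        return "upside-down"
    return "uncertain"


# Hand-drawn 5x4 "E" (1 = ink), converted to black-on-white pixel values.
E_MASK = np.array([[1, 1, 1, 1],
                   [1, 0, 0, 0],
                   [1, 1, 1, 0],
                   [1, 0, 0, 0],
                   [1, 1, 1, 1]])
E_GLYPH = np.where(E_MASK == 1, 0, 255).astype(np.uint8)

print(horizontal_ink_balance(E_GLYPH))         # (8, 5)
print(vote_orientation(E_GLYPH))               # upright
print(vote_orientation(np.rot90(E_GLYPH, 2)))  # upside-down
```

A single character can be misleading, so it would probably make sense to collect the votes of several characters and take the majority. Letters whose ink is balanced or biased the other way would need their own rule.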