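I have an image of a single table row, which I use as input to another deep learning model.
My plan is to split out all the cells with OpenCV and then extract the text from each cell separately.
The expected final output looks like this: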
['1','Ace A1-Smartphone Batch : Batch 1','8517','500 Nos 500 Nox','6,000.00','Nos','3000000.00']
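So far I have detected the horizontal and vertical lines with OpenCV; the output looks like this:
Now I want the coordinates of each rectangle in this image, so that I can draw them on the original image and extract the cells. How can I do this?
This is what I used to find the horizontal and vertical lines in the second image: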
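One idea I have been toying with, as a rough sketch only: since the image is a single row with a regular grid, the cell coordinates might be recoverable from the line positions alone. The snippet below projects a vertical-line mask onto the x axis and a horizontal-line mask onto the y axis, then treats the gaps between consecutive lines as cells. The function names `runs_of_true` and `cells_from_line_masks` and the `min_fraction` threshold are my own placeholders, and it assumes every line spans most of the table (no merged cells).

```python
import numpy as np


def runs_of_true(flags):
    """Return (start, end) pairs, end exclusive, for each run of True values."""
    padded = np.concatenate(([False], flags, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))


def cells_from_line_masks(vertical_lines, horizontal_lines, min_fraction=0.5):
    """Return (x, y, w, h) boxes for the gaps between detected grid lines.

    Both inputs are 2-D arrays that are nonzero on line pixels.
    min_fraction is an assumed threshold: a column (or row) counts as a
    line only if at least that fraction of its pixels are set.
    """
    height, width = vertical_lines.shape
    col_is_line = (vertical_lines > 0).sum(axis=0) >= min_fraction * height
    row_is_line = (horizontal_lines > 0).sum(axis=1) >= min_fraction * width
    col_runs = runs_of_true(col_is_line)
    row_runs = runs_of_true(row_is_line)

    boxes = []
    # A cell starts where one line ends and stops where the next line starts.
    for (_, top), (bottom, _) in zip(row_runs, row_runs[1:]):
        for (_, left), (right, _) in zip(col_runs, col_runs[1:]):
            boxes.append((left, top, right - left, bottom - top))
    return boxes
```

Each box could then be used to crop the original image with `image[y:y + h, x:x + w]`. I am not sure how well this holds up when the scan is skewed or a line is broken, so treat it as a starting point.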
import cv2
import numpy as np
from google.colab.patches import cv2_imshow
document_img = cv2.imread("2.jpg")
table_list = [np.array(document_img, copy=True)]
for each_table in table_list:
    img = cv2.cvtColor(each_table, cv2.COLOR_BGR2GRAY)
    img_height, img_width = img.shape
    # Binarize, then invert so that lines and text are white on black
    thresh, img_bin = cv2.threshold(img, 180, 255, cv2.THRESH_BINARY)
    img_bin_inv = 255 - img_bin
    kernel_len_ver = max(10, img_height // 50)
    kernel_len_hor = max(10, img_width // 50)
    # ksize is (width, height), so the resulting array shapes are the reverse
    ver_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_len_ver))  # array shape (kernel_len_ver, 1)
    hor_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_len_hor, 1))  # array shape (1, kernel_len_hor)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
    # Vertical lines
    image_1 = cv2.erode(img_bin_inv, ver_kernel, iterations=3)
    vertical_lines = cv2.dilate(image_1, ver_kernel, iterations=4)
    cv2_imshow(vertical_lines)
    print("\n\n\n")
    # Horizontal lines
    image_2 = cv2.erode(img_bin_inv, hor_kernel, iterations=3)
    horizontal_lines = cv2.dilate(image_2, hor_kernel, iterations=4)
    cv2_imshow(horizontal_lines)
    print("\n\n\n")
    # Merge both line masks, thicken them, and binarize again
    img_vh = cv2.addWeighted(vertical_lines, 0.5, horizontal_lines, 0.5, 0.0)
    img_vh = cv2.dilate(img_vh, kernel, iterations=5)
    thresh, img_vh = cv2.threshold(img_vh, 50, 255, cv2.THRESH_BINARY)
    cv2_imshow(img_vh)