Using Tesseract, I am trying to read the numbers on the right-hand side of an image, for example:
// Keyed regions need an object literal; `[ key: { ... } ]` is a syntax error.
const COORDINATES = {
  MORE_INFO_LABELS: {
    x: 740,
    y: 165,
    w: 112,
    h: 326,
  },
};
const worker = await createWorker("eng", OEM.TESSERACT_LSTM_COMBINED);
await worker.setParameters({
  tessedit_char_whitelist: "0123456789",
  tessedit_pageseg_mode: PSM.SINGLE_LINE,
});
const moreInfoScreenshot = await cv.imdecodeAsync(
  await fs.readFile("test.png"),
  cv.IMREAD_GRAYSCALE
);
const binaryImage = moreInfoScreenshot.adaptiveThreshold(
  255,
  cv.ADAPTIVE_THRESH_GAUSSIAN_C,
  cv.THRESH_BINARY_INV,
  11,
  2
);
const moreInfoScreenshotPNG = await cv.imencodeAsync(".png", binaryImage);
await cv.imwriteAsync("test-fmt.png", binaryImage);
// Convert an { x, y, w, h } region into the rectangle shape that recognize() expects.
function coordinatesToRectangle(coordinates: Required<Coordinate>) {
  return {
    top: coordinates.y,
    left: coordinates.x,
    width: coordinates.w,
    height: coordinates.h,
  };
}
const {
  data: { text: moreInfoText },
} = await worker.recognize(moreInfoScreenshotPNG, {
  rectangle: coordinatesToRectangle(COORDINATES.MORE_INFO_LABELS),
});
The output image looks like this. The problem is that Tesseract cannot read the smaller digits:
moreInfoText: '100408218\n18870369\n26783840937\n3330133360\n215735\n'
How can I make sure these are read correctly?
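You are in luck: the background of your image is mostly blue. When you read the image in color (careful: OpenCV uses BGR channel order by default, not RGB), you can use cv.split() to extract each color channel: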
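As you can see, because the background is blue, the red channel is already very clean. You can now try running Tesseract on it directly, or additionally threshold and invert the image. I am using OpenCV-Python to demonstrate, but the OpenCV functions are the same: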
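import cv2 as cv

img = cv.imread("image.png")  # loaded in BGR order
_, _, r = cv.split(img)  # channels come back as B, G, R; keep only red
_, thresh = cv.threshold(r, 85, 255, cv.THRESH_BINARY)  # above 85 becomes 255, the rest 0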
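To see why the red channel separates the text from a blue background, here is a small pure-Python sketch of the per-pixel arithmetic. The two pixel values are assumptions chosen for illustration (a mostly blue background and light text), not values measured from your image, and the helper names are hypothetical.

```python
# Illustrative BGR pixel values (assumptions, not measured from the real image).
BACKGROUND_BGR = (200, 60, 20)   # mostly blue: high B, low R
TEXT_BGR = (240, 240, 240)       # light text: high in every channel


def red_channel(pixel_bgr):
    """Return the red component of a (B, G, R) pixel, i.e. the third channel."""
    return pixel_bgr[2]


def threshold_binary(value, thresh=85, maxval=255):
    """Mimic cv.THRESH_BINARY for one value: maxval if value > thresh, else 0."""
    return maxval if value > thresh else 0


def invert(value):
    """Mimic cv.bitwise_not for one 8-bit value."""
    return 255 - value


for name, pixel in (("background", BACKGROUND_BGR), ("text", TEXT_BGR)):
    r = red_channel(pixel)
    t = threshold_binary(r)
    print(f"{name}: red={r}, thresholded={t}, inverted={invert(t)}")
```

With these assumed values the background has almost no red, so it falls below the threshold, while the text stays well above it. The threshold of 85 only needs to sit somewhere between the two red levels.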
Now that the image has been thresholded, you can use cv.bitwise_not() to invert the colors:
cv.bitwise_not(thresh, thresh)