Tesseract accuracy with Python

Question

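I have run Tesseract OCR to convert an image file to a string.

Now I have the output.

How can I check whether the output text file correctly matches the original PNG file?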

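from PIL import Image
from pytesseract import image_to_string

# Upscale the page to a fixed width, keeping the aspect ratio.
basewidth = 2700
img = Image.open('D:OCR\\page1.png')
wpercent = basewidth / float(img.size[0])
hsize = int(float(img.size[1]) * wpercent)
# Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter.
img = img.resize((basewidth, hsize), Image.LANCZOS)
img.save('page1_zoom.png')
# Run OCR on the resized image that was just saved.
print(image_to_string(Image.open('page1_zoom.png')))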

Answer 1

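How to check whether the output is correct?

You need some manually prepared benchmark (golden) data to compare the results against. At a minimum, you need test data or a set of parameters that you want to verify.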

Test cases could be something like:
 1. The whole text
 2. Number of lines
 3. Number of paragraphs
 4. Position of the text
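The first three checks above can be sketched in a few lines of Python. This is only an illustration, not part of Tesseract or pytesseract: the function name compare_ocr and the report keys are made up for this example, and it assumes you already have a hand-checked golden text for the page. It uses difflib from the standard library to score the whole text and counts non-empty lines and blank-line-separated paragraphs.

```python
import difflib
import re


def compare_ocr(ocr_text, golden_text):
    """Compare OCR output with hand-checked (golden) text.

    Hypothetical helper for illustration; returns a small report dict.
    """
    def count_lines(text):
        # Count only lines that contain something other than whitespace.
        return len([ln for ln in text.splitlines() if ln.strip()])

    def count_paragraphs(text):
        # Treat one or more blank lines as a paragraph separator.
        parts = re.split(r"\n\s*\n", text.strip())
        return len([p for p in parts if p.strip()])

    matcher = difflib.SequenceMatcher(None, golden_text, ocr_text)
    return {
        # 1.0 means the two strings are identical.
        "similarity": matcher.ratio(),
        # (golden, ocr) pairs so a mismatch is easy to spot.
        "line_count": (count_lines(golden_text), count_lines(ocr_text)),
        "paragraph_count": (count_paragraphs(golden_text),
                            count_paragraphs(ocr_text)),
    }


if __name__ == "__main__":
    golden = "Hello world\nSecond line\n\nNew paragraph"
    ocr = "Hello w0rld\nSecond line\n\nNew paragraph"
    print(compare_ocr(ocr, golden))
```

Text position is not covered here. If you need it, pytesseract's image_to_data returns bounding boxes per word, which you could compare against manually recorded coordinates.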

Tesseract vs. Google OCR:

If you want to check Tesseract's accuracy against another OCR engine, you can try Google OCR, which gives better results than Tesseract (although it is based on it).
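To compare two engines fairly, score each engine's output against the same golden text. The sketch below computes a word error rate (word-level edit distance divided by the number of reference words). The function name word_error_rate and the sample strings are assumptions for illustration; it does not call any OCR service, so you would paste in the text each engine produced.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; 0.0 means a perfect match.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] is the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from output
                            curr[j - 1] + 1,      # extra word in output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / max(len(ref), 1)


if __name__ == "__main__":
    golden = "the quick brown fox"
    # Made-up sample outputs standing in for two OCR engines.
    outputs = {
        "engine_a": "the quick brawn fox",
        "engine_b": "the quick brown fox",
    }
    for name, text in sorted(outputs.items(),
                             key=lambda kv: word_error_rate(golden, kv[1])):
        print(name, word_error_rate(golden, text))
```

A lower score means fewer word-level mistakes, so the engine listed first did better on that page.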

Tesseract training:

Tesseract also supports training, which can improve the accuracy of its results.

Answer 1 (accepted, 2017-03-07 13:07:10)