Python tesseract cannot read numbers from image
I have a Python script that reads numbers correctly from some images. The type of image that works is here: Working image
I'm trying to use the script with a new kind of image that contains only numbers, but it is not working. The new image type is here: Non working image
My script is as follows:
try:
    from PIL import Image
    from PIL import ImageEnhance
except ImportError:
    import Image
import pytesseract

black = (0,0,0)
white = (255,255,255)
threshold = (160,160,160)

# Open input image in grayscale mode and get its pixels.
img = Image.open("./in/web_search.jpg").convert("LA")

# Multiply each pixel value by 1.3
out = img.point(lambda i: i * 1.3)

# Preview a contrast-enhanced copy (the result is not used below)
enh = ImageEnhance.Contrast(out)
enh.enhance(1.3).show("30% more contrast")

pixels = out.getdata()
newPixels = []

# Compare each pixel
for pixel in pixels:
    if pixel < threshold:
        newPixels.append(black)
    else:
        newPixels.append(white)

# Create and save new image.
newImg = Image.new("RGB",out.size)
newImg.putdata(newPixels)
newImg.save("./out/web_search.jpg")

pytesseract.pytesseract.tesseract_cmd = r'/usr/bin/tesseract'
print("-----------------------")
print(pytesseract.image_to_string(Image.open('./out/web_search.jpg'), lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=1234567890 --tessdata-dir="/usr/share/tesseract-ocr/4.00/tessdata/"'))
print("-----------------------")
The result with my new image is:
-----------------------
Riemer gaat bee 6 eee
-----------------------
Any help, please? Thanks.
You'll probably need to do some work on the image before Tesseract will pick that up. Some things you can do are:
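As a general illustration (not necessarily the steps this answer had in mind), here is a minimal sketch of preprocessing that commonly helps Tesseract with digit-only images: convert to a single grayscale band, stretch the contrast, upscale, then binarize. The helper names `preprocess_for_digits` and `read_digits` are hypothetical, and the scale factor of 2 and the threshold of 160 (borrowed from the question's script) are untuned assumptions. The sketch also swaps `--psm 10`, which treats the image as a single character, for `--psm 7`, which treats it as a single text line; that assumes the number sits on one line.

```python
from PIL import Image, ImageOps


def preprocess_for_digits(img, scale=2, threshold=160):
    """Return an upscaled, black-and-white copy of img (hypothetical helper)."""
    # One 8-bit band ("L") avoids comparing tuples pixel by pixel.
    gray = ImageOps.grayscale(img)
    # Stretch the darkest/lightest values to 0/255 before thresholding.
    gray = ImageOps.autocontrast(gray)
    # Small text often OCRs better after upscaling; factor 2 is a guess.
    big = gray.resize((gray.width * scale, gray.height * scale), Image.LANCZOS)
    # Binarize: values at or above the threshold become white, the rest black.
    return big.point(lambda v: 255 if v >= threshold else 0)


def read_digits(img):
    """Run Tesseract on the preprocessed image (hypothetical helper)."""
    import pytesseract  # imported lazily so preprocessing works without it

    # --psm 7: treat the image as a single text line (assumption about layout).
    config = "--psm 7 -c tessedit_char_whitelist=0123456789"
    return pytesseract.image_to_string(preprocess_for_digits(img), config=config).strip()


# Example usage with the path from the question (requires Tesseract installed):
# print(read_digits(Image.open("./in/web_search.jpg")))
```

Saving the intermediate image as PNG rather than JPEG avoids adding compression artifacts around the glyph edges. Whether the character whitelist is honored can depend on the Tesseract version and engine mode, so it is worth checking the raw output without it as well.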