Read numbers on image using OCR python
I am trying to extract numbers from images using OpenCV and Tesseract in Python. Here's my attempt, but the code doesn't return the expected numbers:
import fitz, pytesseract, os, re
import cv2

sTemp = "Number.png"
directory = r'.\MyFolder'

def useMagick(img):
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
    command = 'magick convert {} -resize 1024x640 -density 300 -quality 100 {}'.format(img, sTemp)
    os.system(command)

def readNumber(img):
    img = cv2.imread(img)
    gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    txt = pytesseract.image_to_string(gry)
    print(txt)
    try:
        return re.findall(r'\d+\s?\/\s?(\d+)', txt)[0]
    except:
        blur = cv2.GaussianBlur(gry, (3,3), 0)
        txt = pytesseract.image_to_string(blur)
        try:
            return re.findall(r'\d+\s?\/\s?(\d+)', txt)[0]
        except:
            return 'REVIEW'

sPath = os.path.join(directory, sTemp)
useMagick(sPath)
x = readNumber(sPath)
print(x)
Here's a sample of the images:
The code doesn't return any digits. How can I improve the quality of such an image so that the numbers can be extracted?
After a lot of searching, I finally solved the problem:
import cv2
import numpy as np
import pytesseract
import os, re

sImagesPath = r'MyFolder/'
mylist = []

def replace_chars(text):
    # Keep only the digits found in the OCR output and join them into one string
    list_of_numbers = re.findall(r'\d+', text)
    result_number = ''.join(list_of_numbers)
    return result_number

for root, dirs, file_names in os.walk(sImagesPath):
    for file_name in file_names:
        # Join with root so files inside subfolders are found as well
        img = cv2.imread(os.path.join(root, file_name))
        gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        # Binarize with a local (adaptive) threshold before running OCR
        thr = cv2.adaptiveThreshold(gry, 181, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 13, 10)
        # Restrict Tesseract to digits only
        txt = pytesseract.image_to_string(thr, lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
        mylist.append(replace_chars(txt))
        print(replace_chars(txt))

# Write one result per line
with open('Output.txt', 'w') as f:
    for i in mylist:
        f.write(i + '\n')
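One thing to keep in mind about `replace_chars`: it joins every run of digits into a single string, so if the OCR output contains two separate numbers they get merged. The sketch below shows the difference on a made-up OCR string (not taken from a real image). `digit_groups` is a hypothetical helper added here only for illustration, in case you need the numbers kept apart.

```python
import re

def replace_chars(text):
    # Same behaviour as the helper above: join every digit run into one string
    return ''.join(re.findall(r'\d+', text))

def digit_groups(text):
    # Hypothetical alternative: keep each run of digits as a separate item
    return re.findall(r'\d+', text)

sample = "12 / 345\n"  # made-up OCR output, for illustration only

print(replace_chars(sample))  # 12345
print(digit_groups(sample))   # ['12', '345']
```

If each image holds exactly one number, joining is fine and also repairs digits that Tesseract split with stray spaces. If an image can hold several numbers, keeping the groups separate is probably safer.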