
How do I get pytesseract to read only numbers from an image?

I've been trying to make a sudoku solver that takes a sudoku puzzle as a PNG image. I want to turn the digits in the image into numbers so that I can put them in a list and run a solving algorithm on them afterwards. However, pytesseract doesn't find the digits reliably and gives inconsistent readings, even though the digits look clear and the image is computer-generated. How can I force pytesseract to look only for digits, and get both the digits and their positions correctly? The example sudoku image and the code are shown in the image below. [enter image description here] You can also check the code at the URL below: https://colab.research.google.com/drive/1I3Gh2TfxMXJyyH2M0ExrMBbffirfg7vq?usp=sharing

If you remove the gridlines from the image first and then use this line, the digits should be read correctly:

text = pytesseract.image_to_string(gray,lang='eng',config='-c tessedit_char_whitelist=123456789 --psm 6')

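tessedit_char_whitelist restricts the characters the engine may output, here to the digits 1 to 9, and --psm 6 tells it to treat the image as a single uniform block of text. To get the position of each result as well, use image_to_data instead of image_to_string; it returns a bounding box (left, top, width, height) for every recognized word.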
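As a rough sketch of how the pieces above could fit together: the helper names below (remove_gridlines, digits_to_grid, read_sudoku) are hypothetical, not part of pytesseract. The gridline removal uses OpenCV morphological opening, which is one common approach and not necessarily the one used for the result above. The sketch assumes dark digits on a light background and an image cropped tightly to the 9x9 grid, and it places each digit in a cell by the centre of its bounding box from image_to_data.

```python
def remove_gridlines(gray):
    """Paint long horizontal/vertical strokes white (assumes dark ink on light paper)."""
    import cv2  # imported lazily; requires opencv-python
    # Invert so that ink is 255 and background is 0.
    _, ink = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
    # Assumption: a grid line is longer than 1/5 of the image, a digit stroke is not.
    span = max(gray.shape) // 5
    horiz = cv2.morphologyEx(
        ink, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_RECT, (span, 1)))
    vert = cv2.morphologyEx(
        ink, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_RECT, (1, span)))
    # Grow the line mask slightly so anti-aliased edges are erased too.
    lines = cv2.dilate(cv2.bitwise_or(horiz, vert), None)
    cleaned = gray.copy()
    cleaned[lines > 0] = 255
    return cleaned


def digits_to_grid(data, img_w, img_h, n=9):
    """Map image_to_data output (dict form) onto an n x n grid; 0 means empty."""
    grid = [[0] * n for _ in range(n)]
    cell_w, cell_h = img_w / n, img_h / n
    boxes = zip(data["text"], data["left"], data["top"],
                data["width"], data["height"])
    for text, left, top, width, height in boxes:
        digits = [ch for ch in text if ch in "123456789"]
        if not digits:
            continue  # structural rows (page, block, line) have empty text
        row = min(int((top + height / 2) / cell_h), n - 1)
        # Tesseract may merge neighbouring digits into one word; as an
        # approximation, spread the characters evenly across the word box.
        step = width / len(digits)
        for i, ch in enumerate(digits):
            col = min(int((left + (i + 0.5) * step) / cell_w), n - 1)
            grid[row][col] = int(ch)
    return grid


def read_sudoku(path):
    """Load an image, strip gridlines, OCR digits only, and return a 9x9 grid."""
    import cv2
    import pytesseract  # requires the Tesseract binary to be installed
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(path)
    clean = remove_gridlines(gray)
    data = pytesseract.image_to_data(
        clean, lang="eng",
        config="-c tessedit_char_whitelist=123456789 --psm 6",
        output_type=pytesseract.Output.DICT)
    height, width = clean.shape
    return digits_to_grid(data, width, height)
```

Calling read_sudoku("sudoku.png") (the file name is a placeholder) would return a list of nine rows, ready to pass to a solver. If digits end up in the wrong column, the even-spacing approximation for merged words is the first thing to check; cropping each cell and running OCR per cell with --psm 10 (single character) is a slower but more predictable alternative.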
