How do I get pyTesseract to only get numbers from an image

Question

I've been trying to make a sudoku solver that takes a sudoku game as a PNG image. I want to turn the digits in the image into numbers so that I can put them in a list and run a solving algorithm afterwards. However, pyTesseract doesn't find the numbers reliably and gives inconsistent readings, even though the digits look clear and the image is computer-generated. How can I force pyTesseract to look only for numbers and return both the numbers and their positions correctly? You can see the example sudoku image as well as the code in the image below. You can also check the code at this URL: https://colab.research.google.com/drive/1I3Gh2TfxMXJyyH2M0ExrMBbffirfg7vq?usp=sharing

Answer 1

If you remove the gridlines from the image and call Tesseract with this line, the digits should be read correctly:

text = pytesseract.image_to_string(gray,lang='eng',config='-c tessedit_char_whitelist=123456789 --psm 6')

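tessedit_char_whitelist restricts recognition to the listed characters, here the digits 1 to 9, and --psm 6 tells the engine to treat the image as a single uniform block of text. To get the position of each recognized digit, you can use image_to_data instead of image_to_string.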
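As a rough sketch of the image_to_data approach, the code below maps each recognized digit to a sudoku cell using the center of its bounding box. It assumes the image is a PIL image cropped to the grid with the gridlines already removed. The names digits_to_grid and read_sudoku are hypothetical helpers, not part of pytesseract, and splitting a multi-digit word evenly across its box width is a heuristic that may not fit every image.

```python
def digits_to_grid(data, image_width, image_height, size=9):
    """Place recognized digits into a size x size grid (0 = empty cell).

    `data` is the dict returned by pytesseract.image_to_data with
    output_type=Output.DICT; only the text/left/top/width/height keys are used.
    """
    grid = [[0] * size for _ in range(size)]
    boxes = zip(data["text"], data["left"], data["top"],
                data["width"], data["height"])
    for text, left, top, width, height in boxes:
        text = text.strip()
        # Skip empty entries (page/block/line rows) and anything that is not 1-9.
        if not text or any(ch not in "123456789" for ch in text):
            continue
        cy = top + height / 2
        row = min(size - 1, int(cy * size / image_height))
        for i, ch in enumerate(text):
            # Heuristic: adjacent digits merged into one word share the box evenly.
            cx = left + width * (i + 0.5) / len(text)
            col = min(size - 1, int(cx * size / image_width))
            grid[row][col] = int(ch)
    return grid


def read_sudoku(image):
    """Run Tesseract on a PIL image and return the digits as a 9x9 grid."""
    # Imported here so digits_to_grid can be used without Tesseract installed.
    import pytesseract

    config = "-c tessedit_char_whitelist=123456789 --psm 6"
    data = pytesseract.image_to_data(
        image, lang="eng", config=config,
        output_type=pytesseract.Output.DICT,
    )
    width, height = image.size
    return digits_to_grid(data, width, height)
```

Calling read_sudoku(Image.open("sudoku.png")) with a file name of your own would then return a list of nine rows, with 0 in every cell where no digit was found.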
