Detect horizontal and vertical text with Tesseract

Question

I have some images containing symbols along with horizontal and vertical text, and I'm trying to detect all of the text using Python and Tesseract OCR. I've done some preprocessing, and the results are shown in this example image, on which Tesseract's bounding boxes, recognized text, and confidences are drawn.

As you can see, the script did a pretty good job, but only on the horizontal text. Is there a simple way, or a Tesseract parameter, that would help me find both horizontal and vertical text in the same image? The only parameter I've set so far is psm = 11 (sparse text).

Answer 1

Your image needs further pre-processing: convert the green color to white. Leaving these lines in would affect the page segmentation in Tesseract. Then you will need to run Tesseract twice: run it once on the image as it is, rotate the image by 90 degrees, and run it again. You don't need to worry about the rotation, because you can still use image_to_data to get the coordinates of each detected piece of text. I'm suggesting this because Tesseract's page segmentation doesn't handle text in multiple directions well. If you wish, you can have a look at the PSMs.

Page segmentation modes:
  0    Orientation and script detection (OSD) only.
  1    Automatic page segmentation with OSD.
  2    Automatic page segmentation, but no OSD, or OCR. (not implemented)
  3    Fully automatic page segmentation, but no OSD. (Default)
  4    Assume a single column of text of variable sizes.
  5    Assume a single uniform block of vertically aligned text.
  6    Assume a single uniform block of text.
  7    Treat the image as a single text line.
  8    Treat the image as a single word.
  9    Treat the image as a single word in a circle.
 10    Treat the image as a single character.
 11    Sparse text. Find as much text as possible in no particular order.
 12    Sparse text with OSD.
 13    Raw line. Treat the image as a single text line,
       bypassing hacks that are Tesseract-specific.
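
The two-pass idea above could look roughly like the sketch below. It is only an illustration under some assumptions: pytesseract and Pillow are installed, the vertical text becomes readable after a 90-degree clockwise rotation (use the opposite rotation and mapping if yours reads the other way), and the function names rotated_box_to_original and ocr_both_orientations are made up for this example. The helper maps a box reported on the rotated image back into the coordinates of the original image, so both passes can be drawn on the same picture.

```python
def rotated_box_to_original(left, top, width, height, original_height):
    """Map a box found on the clockwise-rotated image back to the original.

    A 90-degree clockwise rotation sends original pixel (x, y) to
    (original_height - 1 - y, x), so the inverse swaps the axes and
    flips one of them. Returns (left, top, width, height).
    """
    return (top, original_height - left - width, height, width)


def ocr_both_orientations(image, config="--psm 11", min_conf=0.0):
    """Run Tesseract on a PIL image as is and rotated; return all words.

    Each result is (text, confidence, (left, top, width, height)) with the
    box expressed in the coordinates of the unrotated image.
    """
    # Imported here so the coordinate helper above has no dependencies.
    import pytesseract

    results = []
    # Pillow's rotate() takes counterclockwise degrees, so -90 is clockwise.
    passes = ((image, False), (image.rotate(-90, expand=True), True))
    for img, rotated in passes:
        data = pytesseract.image_to_data(
            img, config=config, output_type=pytesseract.Output.DICT
        )
        for i, text in enumerate(data["text"]):
            conf = float(data["conf"][i])
            if not text.strip() or conf < min_conf:
                continue  # skip empty entries and low-confidence words
            box = (data["left"][i], data["top"][i],
                   data["width"][i], data["height"][i])
            if rotated:
                box = rotated_box_to_original(*box, original_height=image.height)
            results.append((text, conf, box))
    return results
```

You would call it with something like ocr_both_orientations(Image.open("your_image.png")) after the green lines have been removed. The same word may show up in both passes (usually with a low confidence in the wrong orientation), so raising min_conf or de-duplicating overlapping boxes is probably worth trying.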
