Problem recognizing text in an image with the pytesseract Python module

Question

I have attached a 300 DPI image. I am using the code below to extract the text, but I get no text back. Does anyone know what the problem is?

    from PIL import Image
    import pytesseract

    finalImg = Image.open('withdpi.jpg')
    text = pytesseract.image_to_string(finalImg)

Answer 1

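Let's look at what your code is doing.

We need to see which parts of the image are localized and detected as text.
To understand the behavior, we will use the image_to_data function.
image_to_data reports which regions of the image were detected.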

    from PIL import Image, ImageDraw
    import pytesseract

    # Open the image and convert it to gray-scale
    finalImg = Image.open('hP5Pt.jpg').convert('L')

    # Draw on an RGB copy so the red boxes do not end up in the OCR input
    preview = finalImg.convert('RGB')
    previewDraw = ImageDraw.Draw(preview)

    # OCR detection
    d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)

    # Number of detected regions
    n_boxes = len(d['level'])

    # For each detected region
    for i in range(n_boxes):
        # Get the localized region
        (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])

        # rectangle() expects two corner points, not (width, height)
        shape = [(x, y), (x + w, y + h)]

        # Draw the region
        previewDraw.rectangle(shape, outline="red")

    # Display
    preview.show()

    # OCR "psm 6: Assume a single uniform block of text."
    txt = pytesseract.image_to_string(finalImg, config="--psm 6")

    # Result
    print(txt)

Result:

```
 i I
```
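
To see why so little comes back, it can help to list which entries of the image_to_data dictionary actually carry text. The sketch below is an illustration under stated assumptions, not part of the original answer: `words_with_boxes` is a hypothetical helper, the `sample` dictionary is hand-written to mimic the shape of `Output.DICT` (it is not real OCR output), and the default threshold of 0 is an arbitrary choice. It also converts each (left, top, width, height) box into the corner coordinates that `ImageDraw.rectangle` expects.

```python
def words_with_boxes(data, min_conf=0.0):
    """Return (text, (x0, y0, x1, y1)) for every non-empty detection.

    `data` is assumed to have the same keys as the dictionary returned by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT).
    """
    found = []
    for i, text in enumerate(data["text"]):
        # Skip rows without text and rows below the confidence threshold.
        # float() is used because "conf" may arrive as a string or a number.
        if not text.strip() or float(data["conf"][i]) < min_conf:
            continue
        x, y = data["left"][i], data["top"][i]
        w, h = data["width"][i], data["height"][i]
        # Convert (left, top, width, height) to two corner points.
        found.append((text, (x, y, x + w, y + h)))
    return found


# Hand-written sample in the shape of Output.DICT (hypothetical values).
sample = {
    "text": ["", "", "$1,582"],
    "conf": ["-1", "-1", "91"],
    "left": [0, 10, 12],
    "top": [0, 5, 8],
    "width": [200, 180, 60],
    "height": [50, 40, 20],
}

for text, box in words_with_boxes(sample):
    print(text, box)
```

With real data, an empty list from such a helper would indicate that Tesseract localized regions but recognized no words in them.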


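So the displayed image shows that the text was not detected. The code does not work: the output does not contain the desired result.
There could be various reasons.
Here are some facts about the input image:
- It is a binary image.
- It contains a large rectangular artifact.
- The text is slightly dilated.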

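Without testing, we cannot know whether the image needs preprocessing.
We are sure that the large black rectangle is an artifact, and it needs to be removed. One solution is to select only part of the image.
To select part of the image, we use crop and some trial and error to find the region of interest (ROI).
- Suppose we split the image into two halves by height. We do not want the half that contains the artifact.
- At first glance, we want the range (0 -> height/2). If you experiment with these values, you can see that the text lies exactly in the range (height/6 -> height/4).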
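
The trial-and-error search for the text band can be expressed as a small helper. This is a sketch rather than the answer's own code: `band_box` is a hypothetical name, and the 1000 x 1200 size is a made-up example used only to show the arithmetic. `Fraction` is used so that values such as 1/6 are not affected by floating-point rounding before truncation.

```python
from fractions import Fraction


def band_box(size, top_frac, bottom_frac):
    """Build a crop box spanning the full width between two height fractions."""
    width, height = size
    if not 0 <= top_frac < bottom_frac <= 1:
        raise ValueError("expected 0 <= top_frac < bottom_frac <= 1")
    # Box layout is (left, upper, right, lower), as used by Image.crop.
    return (0, int(height * top_frac), width, int(height * bottom_frac))


size = (1000, 1200)  # made-up (width, height); use the image's .size in practice

# First guess: the top half of the image.
print(band_box(size, Fraction(0), Fraction(1, 2)))

# Narrowed band from height/6 to height/4.
print(band_box(size, Fraction(1, 6), Fraction(1, 4)))
```

The returned tuple can be passed directly to Pillow, for example `finalImg.crop(band_box(finalImg.size, Fraction(1, 6), Fraction(1, 4)))`.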
The result will be:
```
 $1,582
```
Code:

    from PIL import Image, ImageDraw
    import pytesseract

    # Open the image and convert it to gray-scale
    finalImg = Image.open('hP5Pt.jpg').convert('L')

    # Get the width and height of the image
    width, height = finalImg.size

    # Keep only the band that contains the desired text
    finalImg = finalImg.crop((0, int(height/6), width, int(height/4)))

    # Draw on an RGB copy so the red boxes do not end up in the OCR input
    preview = finalImg.convert('RGB')
    previewDraw = ImageDraw.Draw(preview)

    # OCR detection
    d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)

    # Number of detected regions
    n_boxes = len(d['level'])

    # For each detected region
    for i in range(n_boxes):
        # Get the localized region
        (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])

        # rectangle() expects two corner points, not (width, height)
        shape = [(x, y), (x + w, y + h)]

        # Draw the region
        previewDraw.rectangle(shape, outline="red")

    # Display
    preview.show()

    # OCR "psm 6: Assume a single uniform block of text."
    txt = pytesseract.image_to_string(finalImg, config="--psm 6")

    # Result
    print(txt)

If you cannot get the same result as I did, check which Tesseract version pytesseract is using:

    print(pytesseract.get_tesseract_version())

For me, the result is 4.1.1.
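
If results differ, comparing the installed version against the one reported above can narrow things down. The following is only a sketch: `version_tuple` and `differs_from_answer` are hypothetical helpers, the comparison on major and minor numbers is an arbitrary choice, and "5.3.0" is just an example input, not a version the answer was tested with.

```python
import re


def version_tuple(version):
    """Turn a version such as '4.1.1' into (4, 1, 1), ignoring any suffix."""
    match = re.match(r"\d+(?:\.\d+)*", str(version))
    if match is None:
        raise ValueError(f"no version number found in {version!r}")
    return tuple(int(part) for part in match.group().split("."))


# Version reported in the answer above.
ANSWER_VERSION = version_tuple("4.1.1")


def differs_from_answer(installed):
    """True when the installed major.minor differs from the answer's."""
    return version_tuple(installed)[:2] != ANSWER_VERSION[:2]


print(differs_from_answer("4.1.1"))
print(differs_from_answer("5.3.0"))  # example input only
```

In practice the argument would be `str(pytesseract.get_tesseract_version())`.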

Solution 1
1 2021-02-26 14:04:53
