How do I pass an OpenCV image to Tesseract in Python?

Question

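The Python code below calls Tesseract's C API through the ctypes library. In Option #1 the image is loaded by Tesseract itself, and that works fine. The problem is in Option #2: when I try to pass an image loaded by OpenCV, Tesseract returns garbage: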

from ctypes import *
import cv2

class API(Structure):
    _fields_ = []

lang = "eng"
ts = cdll.LoadLibrary("c:/Tesseract-OCR/libtesseract302.dll")
ts.TessBaseAPICreate.restype = POINTER(API)
api = ts.TessBaseAPICreate()
rc = ts.TessBaseAPIInit3(api, 'c:/Tesseract-OCR/', lang)

##### Option #1
out = ts.TessBaseAPIProcessPages(api, 'c:/Tesseract-OCR/doc/eurotext.tif', None, 0)
print 'Option #1 => ' + string_at(out)

##### Option #2
#TESS_API void  TESS_CALL TessBaseAPISetImage(TessBaseAPI* handle, const unsigned char* imagedata, int width, int height,
#                                             int bytes_per_pixel, int bytes_per_line);

im = cv2.imread('c:/Temp/Downloads/test-slim/eurotext.jpg', cv2.COLOR_BGR2GRAY)
c_ubyte_p = POINTER(c_ubyte)
##ts.TessBaseAPISetImage.argtypes = [POINTER(API), c_ubyte_p, c_int, c_int, c_int, c_int]
ts.TessBaseAPISetImage(api, im.ctypes.data_as(c_ubyte_p), 800, 1024, 3, 800 * 3)
out = ts.TessBaseAPIGetUTF8Text(api)
print 'Option #2 => ' + string_at(out)

The output is as follows:

Option #1 => The (quick) [brown] {fox} jumps! Over the $43,456.78 #90 dog & duck/goose, as 12.5% of E-mail from aspammer@website.com is spam. Der ,,schnelle'braune Fuchsspringtï¬ berdenfaulen Hund. Le renard brun «rapide» saute par-dessus le chien paresseux. La volpe marone rapida salta sopra il cane pigro. Elzorromarrénrépidosalta sobre el perro perezoso. A raposa marrom rzipida salta sobreocï preguicoso.

Option #2 => 7？：5：*：> \\' - 〜; 2 - ; i3E：？：; i3“。i：ii ... 3;”f-i©％ :::â€:::？：=â€™:: =Â£<：7â€œ§5。<：â€œ¡¡¡¡¡¡¡¡¡ â€œ...... =：a，'; 2â€：3â€ ：3_3：l。'：â€œ：â€œ：Â：â€œ：-_：Â§ 3 ;;％Â§％AI5〜一«：Ã©:: 3％的IAA»â,¬E：

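Notes:

I tried the python-tesseract and tightocr libraries, which are good enough but lack documentation.
Here I use OpenCV's imread so that mathematical algorithms can be applied to the image matrix.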

Any ideas on how to pass an OpenCV image (a numpy.ndarray) to Tesseract? Any help would be appreciated.
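One possible cause of the garbage in Option #2, offered as a guess rather than a confirmed diagnosis, is that the hard-coded arguments (800, 1024, 3, 800 * 3) do not match the actual layout of the array returned by cv2.imread. A NumPy image has shape (height, width) or (height, width, channels), so the width and height are easy to swap, and a grayscale array has 1 byte per pixel, not 3. The sketch below derives the four TessBaseAPISetImage arguments from the array's shape and strides instead. The helper name tess_set_image_args is hypothetical, and the code assumes a uint8 array whose pixels are tightly packed within each row.

```python
def tess_set_image_args(shape, strides):
    """Derive (width, height, bytes_per_pixel, bytes_per_line) for a uint8 image.

    `shape` and `strides` are the attributes of a numpy.ndarray, e.g.
    im.shape and im.strides. Hypothetical helper, not part of Tesseract.
    """
    if len(shape) == 2:
        # Grayscale: (height, width), one byte per pixel.
        height, width = shape
        bytes_per_pixel = 1
    elif len(shape) == 3:
        # Colour: (height, width, channels), one byte per channel.
        height, width, bytes_per_pixel = shape
    else:
        raise ValueError("expected a 2-D or 3-D image array")

    # Assumption: pixels are tightly packed within a row.
    if strides[1] != bytes_per_pixel:
        raise ValueError("pixels are not contiguous; copy the array first")

    # The stride of axis 0 is the number of bytes from one row to the next.
    bytes_per_line = strides[0]
    return width, height, bytes_per_pixel, bytes_per_line


# Intended use with the names from the question (not executed here):
#   im = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
#   w, h, bpp, bpl = tess_set_image_args(im.shape, im.strides)
#   ts.TessBaseAPISetImage(api, im.ctypes.data_as(c_ubyte_p), w, h, bpp, bpl)
```

Note that the second argument of cv2.imread is a read flag such as cv2.IMREAD_GRAYSCALE. cv2.COLOR_BGR2GRAY is a conversion code meant for cv2.cvtColor, so the call in the question may not return a grayscale array. Declaring ts.TessBaseAPISetImage.argtypes, as in the commented-out line, also keeps ctypes from guessing at pointer and integer sizes.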

Answer 1

I use the following with Python 3 (bw_img is a numpy.ndarray):

import numpy as np
import cv2
from PIL import Image
import pytesseract

...

(thresh, bw_img) = cv2.threshold(bw_img, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
...

img = Image.fromarray(bw_img)
txt = pytesseract.image_to_string(img)
print(txt)
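The snippet above passes a single-channel thresholded image, so channel order does not matter. If a colour image loaded with cv2.imread were passed instead, the array would be in BGR order while PIL's Image.fromarray interprets a 3-channel uint8 array as RGB. A minimal sketch of a conversion step follows. The helper name to_pil_layout is hypothetical, and it only handles 2-D grayscale and 3-channel BGR arrays.

```python
import numpy as np


def to_pil_layout(img):
    """Return `img` in the channel order PIL expects.

    2-D (grayscale) arrays are returned unchanged. 3-channel arrays are
    assumed to be BGR, as produced by cv2.imread, and are reversed to RGB.
    """
    if img.ndim == 2:
        return img
    if img.ndim == 3 and img.shape[2] == 3:
        # Reverse the channel axis (BGR -> RGB) and make the result contiguous.
        return np.ascontiguousarray(img[:, :, ::-1])
    raise ValueError("expected a grayscale or 3-channel BGR image")


# Intended use with the names from the answer (not executed here):
#   img = Image.fromarray(to_pil_layout(cv_img))
#   txt = pytesseract.image_to_string(img)
```

cv2.cvtColor(cv_img, cv2.COLOR_BGR2RGB) performs the same reordering. The slicing version is used here only so that the sketch depends on NumPy alone.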

Solution 1
13 2016-10-29 17:58:37
