简体   繁体   English

如何正确阅读easyocr的文本?

[英]How to read the text by easyocr correctly?

原始图像 I am trying to read images from a camera module and so far I got to process the image this way using adaptive filtering.我正在尝试从相机模块读取图像,到目前为止,我必须使用自适应滤波以这种方式处理图像。 Besides, I did a lot of manipulation to crop the ROI and read the text.此外,我做了很多操作来裁剪 ROI 并阅读文本。 However, it is reading the number but not the units beside the numbers, which are comparatively small in size.但是,它正在读取数字而不是数字旁边的单位,这些单位的大小相对较小。 How do I solve this problem?我该如何解决这个问题? 输出

import easyocr 
import cv2
import numpy as np

import matplotlib.pyplot as plt
import time
import urllib.request
url = 'http://192.168.137.108/cam-hi.jpg'
while True:
    img_resp=urllib.request.urlopen(url)
    imgnp=np.array(bytearray(img_resp.read()),dtype=np.uint8)
    image = cv2.imdecode(imgnp,-1)
    image = cv2.medianBlur(image,7)
    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)    #to gray convert
    th3 = cv2.adaptiveThreshold(gray_image,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\
                cv2.THRESH_BINARY,11,2) #adaptive threshold gaussian filter used
    kernel = np.ones((5,5),np.uint8)
    opening = cv2.morphologyEx(th3, cv2.MORPH_OPEN, kernel)
    

    x = 0   #to save the position, width and height for contours(later used)
    y = 0
    w = 0
    h = 0

    cnts = cv2.findContours(opening, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    threshold =  10
    font = cv2.FONT_HERSHEY_SIMPLEX  
    org = (50, 50) 
    fontScale = 1 
    color = (0, 0, 0)
    thickness = 2
        
    for c in cnts:
        
        approx = cv2.approxPolyDP(c,0.01*cv2.arcLength(c,True),True)
        area = cv2.contourArea(c)   
        if  len(approx) == 4 and area > 100000:   #manual area value used to find ROI for rectangular contours
        
            cv2.drawContours(image,[c], 0, (0,255,0), 3)
            n = approx.ravel()
            font = cv2.FONT_HERSHEY_SIMPLEX
            (x, y, w, h) = cv2.boundingRect(c)
            old_img = opening[y:y+h, x:x+w]  #selecting the ROI
            width, height = old_img.shape
            cropped_img = old_img[50:int(width/2), 0:height] #cropping half of the frame of ROI to just focus on the number
            
            new = reader.readtext(cropped_img)   #reading text using easyocr
            if(new == []): 
                text = 'none'
            else:
                text = new
                print(text)
#                 cv2.rectangle(cropped_img, tuple(text[0][0][0]), tuple(text[0][0][2]), (0, 0, 0), 2)
                if(text[0][2] > 0.5): #checking the confidence level
                    
                    cv2.putText(cropped_img, text[0][1], org, font, fontScale, color, thickness, cv2.LINE_AA)        
            cv2.imshow('frame1',cropped_img)
    key = cv2.waitKey(5) 

    if key == 27:
        break

cv2.waitKey(0)
cv2.destroyAllWindows()
    
    

This is the best I could get.这是我能得到的最好的。 The Greek symbol ' mu ' is identified as ' p '.希腊符号“ mu ”被标识为“ p ”。 I also tried searching for Greek language model related to easyocr but could not find any.我还尝试搜索与easyocr相关的希腊语言模型,但找不到。

在此处输入图像描述

Here is what I did:这是我所做的:

  • Performed Otsu Threshold on the entire image对整个图像执行 Otsu Threshold
  • Selected contour with largest area and cropped it选择面积最大的轮廓并裁剪它
  • Converted the cropped image to LAB color space将裁剪后的图像转换为 LAB 颜色空间
  • Manually performed binary threshold on A-channel在 A 通道上手动执行二进制阈值

I got the following:我得到以下信息:

在此处输入图像描述

Passed this image as input to easyocr :将此图像作为输入传递给easyocr

from easyocr import Reader
reader = Reader(['en'])

# input is the cropped image
results = reader.readtext(crop_img)

# convert to LAB space
lab = cv2.cvtColor(crop_img, cv2.COLOR_BGR2LAB)

# threshold on A-channel
r,th = cv2.threshold(lab[:,:,1],125,255,cv2.THRESH_BINARY_INV)

# create copy of cropped image
crop_img2 = crop_img.copy()

# draw only first 5 results for clarity
# borrowed from: https://pyimagesearch.com/2020/09/14/getting-started-with-easyocr-for-optical-character-recognition/
for (bbox, text, prob) in results[:5]:
  (tl, tr, br, bl) = bbox
  tl = (int(tl[0]), int(tl[1]))
  tr = (int(tr[0]), int(tr[1]))
  br = (int(br[0]), int(br[1]))
  bl = (int(bl[0]), int(bl[1]))
  crop_img2 = cv2.rectangle(crop_img2, tl, br, (0, 0, 255), 3)
  crop_img2 = cv2.putText(crop_img2, text, (tl[0], tl[1] - 20), cv2.FONT_HERSHEY_SIMPLEX, 1.1, (0, 0, 0), 5)

If you try to clear the image and pass path to below method it work try如果您尝试清除图像并将路径传递给以下方法,则可以尝试

def text_extraction(image, lang_code='en'):
    reader = easyocr.Reader([lang_code], gpu=False)
    roi = cv2.imread(image)#[85:731, 265:1275]
    output = reader.readtext(roi)
    # it returns list of tuple with ([x,y coordinates],text,text_threshold)
    return output

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM