
Real-time OCR lag

I'm trying to capture the position of a license plate in a webcam feed using YOLOv4-tiny, then pass the result to EasyOCR to extract the characters. The detection works well in real time, but when I apply the OCR the webcam stream becomes really laggy. Is there any way I can improve this code to make it less laggy?

My YOLOv4 detection:

#detection
while 1:
    #_, pre_img = cap.read()
    #pre_img= cv2.resize(pre_img, (640, 480))
    _, img = cap.read()
    #img = cv2.flip(pre_img,1)
    hight, width, _ = img.shape
    blob = cv2.dnn.blobFromImage(img, 1 / 255, (416, 416), (0, 0, 0), swapRB=True, crop=False)

    net.setInput(blob)

    output_layers_name = net.getUnconnectedOutLayersNames()

    layerOutputs = net.forward(output_layers_name)

    boxes = []
    confidences = []
    class_ids = []

    for output in layerOutputs:
        for detection in output:
            score = detection[5:]
            class_id = np.argmax(score)
            confidence = score[class_id]
            if confidence > 0.7:
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * hight)
                w = int(detection[2] * width)
                h = int(detection[3] * hight)
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append((float(confidence)))
                class_ids.append(class_id)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, .5, .4)

    boxes = []
    confidences = []
    class_ids = []

    for output in layerOutputs:
        for detection in output:
            score = detection[5:]
            class_id = np.argmax(score)
            confidence = score[class_id]
            if confidence > 0.5:
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * hight)
                w = int(detection[2] * width)
                h = int(detection[3] * hight)

                x = int(center_x - w / 2)
                y = int(center_y - h / 2)

                boxes.append([x, y, w, h])
                confidences.append((float(confidence)))
                class_ids.append(class_id)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, .8, .4)
    font = cv2.FONT_HERSHEY_PLAIN
    colors = np.random.uniform(0, 255, size=(len(boxes), 3))
    if len(indexes) > 0:
        for i in indexes.flatten():
            x, y, w, h = boxes[i]
            label = str(classes[class_ids[i]])
            confidence = str(round(confidences[i], 2))
            color = colors[i]
            cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
           # detection= cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
            detected_image = img[y:y+h, x:x+w]
            cv2.putText(img, label + " " + confidence, (x, y + 400), font, 2, color, 2)
            #print(detected_image)
            cv2.imshow('detection',detected_image)

            cv2.imwrite('lp5.jpg',detected_image)
            cropped_image = cv2.imread('lp5.jpg')
            cv2.waitKey(5000)
            print("system is waiting")
            result = OCR(cropped_image)
            print(result)

EasyOCR function:

def OCR(cropped_image):
    reader = easyocr.Reader(['en'], gpu=False)  # what the reader expect from  the image
    result = reader.readtext(cropped_image)
    text = ''
    for result in result:
        text += result[1] + ' '

    spliced = (remove(text))
    return spliced

You are essentially saying "the while loop must be fast." And of course the OCR() call is a bit slow. OK, good.

Don't call OCR() from within the loop.

Rather, enqueue a request, and let another thread / process / host worry about the OCR computation, while the loop quickly continues on its merry way.

You could use a threaded Queue, or a subprocess, or blast it over to RabbitMQ or Kafka. The simplest approach would be to simply overwrite /tmp/cropped_image.png within the loop, and have another process notice such updates and (slowly) call OCR(), appending the results to a log file.

There might be a couple of updates to the image file while a single OCR call is in progress, and that's fine. The two are decoupled from one another, each progressing at its own pace. The downside of a queue would be OCR sometimes falling behind -- you actually want to shed load by skipping some (redundant) cropped images.
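As a rough illustration of the threaded variant with load shedding, here is a minimal sketch. It assumes a single producer (the detection loop) and a single worker thread; `slow_ocr`, `submit_latest`, and `ocr_worker` are hypothetical names, and `slow_ocr` is only a stand-in for the OCR() function from the question.

```python
import queue
import threading


def slow_ocr(image):
    """Hypothetical stand-in for the question's OCR() function."""
    return f"text-for-{image}"


def submit_latest(q, item):
    """Offer item to a size-1 queue, dropping a stale pending item if needed.

    Assumes exactly one producer thread calls this function.
    """
    try:
        q.put_nowait(item)
    except queue.Full:
        try:
            q.get_nowait()  # shed the older, now redundant crop
        except queue.Empty:
            pass  # the worker picked it up in the meantime
        q.put_nowait(item)


def ocr_worker(q, results, ocr=slow_ocr):
    """Run OCR on queued items until a None sentinel arrives."""
    while True:
        item = q.get()
        if item is None:
            break
        results.append(ocr(item))


jobs = queue.Queue(maxsize=1)  # holds at most one pending crop
results = []
worker = threading.Thread(target=ocr_worker, args=(jobs, results), daemon=True)
worker.start()
```

Inside the detection loop, the call to OCR() would then become something like `submit_latest(jobs, detected_image.copy())`, which returns immediately. Because the queue holds only one item, the worker always sees the most recent crop and older ones are silently skipped.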


The two are racing, and that's fine. But take care to do things in atomic fashion -- you wouldn't want to OCR an image that starts with one frame and ends with part of a subsequent frame. Write to a temp file and, after close(), use os.rename() to atomically make those pixels available under the name that the OCR daemon will read from. Once it has a file descriptor open for read, it will have no problem reading to EOF without interference; the kernel takes care of that for us.
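The write-then-rename step could look roughly like the sketch below. The helper name `atomic_write_bytes` is hypothetical. It uses os.replace() instead of os.rename(): the two behave the same on POSIX, but os.replace() also overwrites an existing target on Windows. The temp file is created in the destination directory so the rename stays on one filesystem.

```python
import os
import tempfile


def atomic_write_bytes(path, data):
    """Write data to path so that readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the final rename is on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # The file is closed at this point; publish it under the final name.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With OpenCV, the bytes could come from `ok, buf = cv2.imencode('.png', detected_image)` followed by `atomic_write_bytes('/tmp/cropped_image.png', buf.tobytes())`.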

There are several points.

  1. cv2.waitKey(5000) in your loop blocks for up to 5 seconds on every detection unless you press a key. So remove it if you are not debugging (or reduce it to cv2.waitKey(1), which imshow still needs in order to refresh the window).

  2. You are saving the detected region to a JPEG file and loading it back each time. Do not do that - just pass the OpenCV image (a NumPy array) directly to the OCR module.

  3. EasyOCR is a DNN model based on ResNet, but you are not using a GPU (gpu=False). So you should use a GPU. See this benchmark by Liao.

  4. You are creating a new EasyOCR Reader instance on every OCR() call inside the loop. Create only one instance before the loop and reuse it inside the loop. I think this is the most important bottleneck.
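Points 2 and 4 might be combined along the lines of the sketch below. It assumes EasyOCR's default `readtext` output of `(bbox, text, confidence)` tuples. `get_reader` and `ocr` are hypothetical names, and the question's undefined `remove()` helper is left out in favour of a plain join.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_reader():
    """Build the EasyOCR reader once; later calls return the cached object."""
    import easyocr  # imported lazily so the model loads on first use only

    # gpu=False mirrors the question; switch to gpu=True if CUDA is available.
    return easyocr.Reader(['en'], gpu=False)


def ocr(cropped_image, reader=None):
    """Run OCR on an in-memory image (NumPy array) and join the text pieces."""
    if reader is None:
        reader = get_reader()
    detections = reader.readtext(cropped_image)
    # Each detection is assumed to be (bbox, text, confidence).
    return ' '.join(text for _bbox, text, _conf in detections).strip()
```

In the loop, `result = ocr(detected_image)` then replaces the imwrite/imread round trip, and the expensive Reader construction happens only on the first call.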
