
Bad performance with Python multiprocessing using OpenCV and Tesseract

I am trying to parallelize code whose purpose is to extract text from videos, doing OCR with OpenCV and Tesseract. Specifically, there are 9 videos, and those 9 videos are divided into 3 categories, so all videos in a category can be analyzed the same way. That is why I wrote 3 main functions, one per category, which means each function can be reused for the 3 videos of its category. Each function takes between 90 and 200 seconds when run on its own, so if I analyze 3 videos in a single execution, the total time is much longer, because the functions execute sequentially.

That is why I decided to use the multiprocessing module, and I did get the functions to run in parallel, but I did not get the performance I expected. When I run 2 processes in parallel (1 video per process), execution time increases by roughly 10-15%, which is acceptable. But when I run 3 processes in parallel (again 1 video per process), execution time increases drastically; in fact, I noticed the processes had stalled from the silence of my CPU cooler. I checked this with htop on my Linux system (Ubuntu 20.04.2 LTS), and it was indeed so: when running 3 processes in parallel, at a certain point all 6 CPU cores hit their limit (100%), causing the processes to stall.
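For reference, the processes are launched roughly like this (a simplified sketch; the function and file names below are placeholders, the real code is in the repository linked at the end):

    import multiprocessing as mp

    def analyze_category1(path):
        # Placeholder: the real OCR/extraction logic lives in the repository.
        print(f"analyzing {path}")

    def analyze_category2(path):
        print(f"analyzing {path}")

    def analyze_category3(path):
        print(f"analyzing {path}")

    if __name__ == "__main__":
        jobs = [
            (analyze_category1, "video_cat1.mp4"),
            (analyze_category2, "video_cat2.mp4"),
            (analyze_category3, "video_cat3.mp4"),
        ]
        procs = [mp.Process(target=fn, args=(path,)) for fn, path in jobs]
        for p in procs:
            p.start()  # one worker process per video
        for p in procs:
            p.join()   # wait for all extractions to finish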

[Image: CPU usage shown in htop]

I found a way to partially work around it: I staggered the start times of the executions, so the processes were not all using 100% of the cores at the same time, and got an acceptable execution time. But I still need to analyze more videos in parallel; 3 is still too few. Is there any way to increase performance? I really did not expect this from Python, considering it is running on an i5-8600K with 16 GB of RAM at 3200 MHz.
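The staggering amounts to something like this (a minimal sketch, with a placeholder analyze function and an arbitrary 30-second offset):

    import time
    import multiprocessing as mp

    def analyze(path):
        # Placeholder for one of the per-category functions from the repository.
        print(f"analyzing {path}")

    if __name__ == "__main__":
        videos = ["video_cat1.mp4", "video_cat2.mp4", "video_cat3.mp4"]
        procs = []
        for path in videos:
            p = mp.Process(target=analyze, args=(path,))
            p.start()
            procs.append(p)
            time.sleep(30)  # offset each start so the CPU-heavy phases overlap less
        for p in procs:
            p.join()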

Important to mention:

  • The excessive CPU usage comes from various for loops inside the functions; these loops call OpenCV methods and are required for the data extraction (see the threading sketch after this list).
  • The processes do not transfer any data between them.
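Worth noting: many OpenCV functions are internally multithreaded, so 3 worker processes can together demand more than 3 cores, which might explain all 6 cores hitting 100%. A minimal sketch of limiting OpenCV to one thread per worker via cv2.setNumThreads (again with a placeholder analyze function standing in for the real ones):

    import multiprocessing as mp

    import cv2

    def analyze(path):
        # Placeholder for one of the per-category functions from the repository.
        print(f"analyzing {path}")

    def worker(path):
        # Restrict OpenCV's internal thread pool in this process, so N worker
        # processes use about N cores instead of N * (OpenCV's thread count).
        cv2.setNumThreads(1)
        analyze(path)

    if __name__ == "__main__":
        videos = ["video_cat1.mp4", "video_cat2.mp4", "video_cat3.mp4"]
        procs = [mp.Process(target=worker, args=(path,)) for path in videos]
        for p in procs:
            p.start()
        for p in procs:
            p.join()

Whether this actually helps depends on which OpenCV calls dominate the loops.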

If you want to check the code, you can find it here: GitHub repository
