I have to extract text from screenshot images like this one: Screenshot.png
First I detect and crop the PowerShell window from the screenshot (using YOLOv5 for detection), like this: Cropped.JPEG. After cropping the image, I save it as a JPEG.
Then I have to extract the text from this cropped image.
I know that Tesseract works best with dark text on a light background, so I inverted the image: inverted image.JPEG. After doing this, I get the expected output for a few images, but not for most of them.
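A minimal sketch of that inversion step, assuming the crop is already available as an 8-bit grayscale NumPy array. The helper name `ensure_dark_text_on_light` and the threshold of 127 are illustrative assumptions, not part of the original code, and the small array below is a synthetic stand-in for a real crop.

```python
import numpy as np

def ensure_dark_text_on_light(gray, threshold=127):
    """Invert an 8-bit grayscale image if its background looks dark."""
    gray = np.asarray(gray, dtype=np.uint8)
    # The background usually dominates a console crop, so the median
    # pixel value is a rough proxy for the background brightness.
    if np.median(gray) < threshold:
        return 255 - gray  # same result as cv2.bitwise_not for uint8 images
    return gray

# Synthetic stand-in for a PowerShell crop: dark background, light "text".
crop = np.full((4, 6), 12, dtype=np.uint8)
crop[1, 1:5] = 240
fixed = ensure_dark_text_on_light(crop)
```

Checking the brightness first means an image that is already light is passed through unchanged instead of being inverted into light text on a dark background.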
I have tried the following methods.
I am not sure what I am doing wrong.
If you are using pytesseract, there is no need for thresholding. Resizing the image and converting it to grayscale is usually enough to get the desired output.
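import cv2
import pytesseract

image = cv2.imread('Cropped.JPEG')  # the cropped PowerShell region
# Upscale so small console text is easier to read (the factor of 2 is an assumption; tune it)
image = cv2.resize(image, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
text = pytesseract.image_to_string(gray, lang='eng')
# Split into lines, strip whitespace, drop empty lines, and lowercase the rest
space_clean_list = [x.strip().lower() for x in text.split('\n') if x.strip()]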