I have a lot of images extracted from a search engine, and I am using OCR to get decent text extraction from them, but some of the images do not contain any text.
So I would like to determine in Python whether an image contains text at all; if it doesn't, I wouldn't have to perform OCR on it. Ideally this method would have high recall.
Use pytesseract. Something like this:
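from PIL import Image
import pytesseract

def contains_text(image_path):
    # Run OCR and strip whitespace, since the output for a text-free image
    # may contain only whitespace or control characters rather than "".
    text = pytesseract.image_to_string(Image.open(image_path)).strip()
    if text == "":
        return False  # No text detected
    else:
        return text  # Non-empty string, so it is truthy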
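OCR can also produce a few stray characters on images with no real text, so a plain empty-string check may be too loose or too strict depending on the data. One possible refinement, sketched here under assumptions, is to look at the per-word confidences from `pytesseract.image_to_data` and accept the image if at least one non-blank word reaches a threshold. The helper name `has_confident_words` and the default `min_conf=30.0` are hypothetical choices for illustration, not tuned values; lowering the threshold should favour recall at the cost of more false positives.

```python
def has_confident_words(data, min_conf=30.0):
    """Return True if any non-blank word in `data` has confidence >= min_conf.

    `data` is expected to look like the dict returned by
    pytesseract.image_to_data(..., output_type=Output.DICT),
    i.e. parallel lists under the "text" and "conf" keys.
    """
    for word, conf in zip(data["text"], data["conf"]):
        try:
            score = float(conf)  # conf may be an int, a float, or a string
        except (TypeError, ValueError):
            continue  # skip entries whose confidence cannot be parsed
        if str(word).strip() and score >= min_conf:
            return True
    return False


def contains_text(image_path, min_conf=30.0):
    # Imported lazily so the helper above can be used without Tesseract installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)
    return has_confident_words(data, min_conf)
```

Calling `contains_text("some_image.png", min_conf=10)` would then flag an image as soon as Tesseract reports any word with at least that confidence; the file name is a placeholder.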
I do not know of a way to detect that there is no text without trying to perform OCR (like above).