I know that Ghostscript can convert a scanned PDF document into an OCR'd (searchable) PDF document, using the command below:
> bin\gswin64c.exe -sDEVICE=pdfocr32 -o D:\OCR\outputOCRdPDF.pdf -r600 -dDownScaleFactor=3 InputScannedPDF.pdf
Ghostscript uses the open-source Tesseract engine to do this. As far as I can tell from the OCR devices available in Ghostscript, the result is always another PDF document and not plain text (whereas Tesseract on its own can also produce plain text).
It looks like I am missing something about the usage. Please correct me, or suggest how to convert the scanned PDF to plain text instead of a PDF.
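For reference, here is a rough sketch of what I would like to end up with, written as a small Python wrapper around the Ghostscript call. It assumes the Ghostscript build also ships the plain-text `ocr` device that the Ghostscript documentation lists next to the `pdfocr` devices; I have not confirmed that my build includes it. The function names, the output file name and the 300 dpi resolution are placeholders of mine, not anything taken from the Ghostscript documentation.

```python
import subprocess


def build_gs_ocr_command(gs_exe, input_pdf, output_txt, resolution=300):
    """Build the Ghostscript argument list for OCR to plain text.

    Assumption: the build provides the `ocr` device (UTF-8 text output).
    The resolution default is an arbitrary placeholder.
    """
    return [
        gs_exe,
        "-sDEVICE=ocr",      # assumed plain-text OCR device
        f"-r{resolution}",   # rendering resolution passed to the OCR engine
        "-o", output_txt,    # -o sets the output file (and implies batch mode)
        input_pdf,
    ]


def run_ocr_to_text(gs_exe, input_pdf, output_txt, resolution=300):
    """Run Ghostscript; raises CalledProcessError if it exits with an error."""
    cmd = build_gs_ocr_command(gs_exe, input_pdf, output_txt, resolution)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Paths mirror the command in the question; the .txt name is hypothetical.
    run_ocr_to_text(
        r"bin\gswin64c.exe",
        "InputScannedPDF.pdf",
        r"D:\OCR\outputOCRdText.txt",
    )
```

If the `ocr` device is not available, I assume a fallback would be to keep the `pdfocr32` step above and then run the resulting PDF through the `txtwrite` device to pull out the text layer.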