How to convert scanned PDF to searchable PDF in Python? [Environment:Windows]

Question

I have scanned pdf, I just want to convert searchable PDF by using python. I can convert on Abode but I want to do programmatically and it should be open-source. Can anyone please help to convert the PDF?

Note: It should not remove any image on PDF.

Answer 1

I have solved the problem by using wand package. Example code:

from pdf2image import convert_from_path

from wand.image import Image as WandImage

TIFFPdf = convert_from_path(pdfFileName)
pageNumber = 0
for img in TIFFPdf:
        pageNumber = pageNumber + 1
   
        img1 = WandImage()
        img1.read(filename='suresh.pdf' + '[' + str(pageNumber) + ']', resolution=300)
        img1.compression = 'group4'
        img1.save(filename=str(pageNumber) + '.tif')

How to convert scanned PDF to searchable PDF in Python? [Environment:Windows]

Question

1 answers

solution1
0 2021-01-29 10:02:30

How to convert scanned PDF to searchable PDF in Python? [Environment:Windows]

Question

1 answers

solution1 0 2021-01-29 10:02:30

solution1
0 2021-01-29 10:02:30