
Python module that can remove the OCRed text layer from one pdf file and move it to another?

Original post 2014-09-29 22:31:27 9 1 python / pdf

I have two PDF files that are almost the same, except that the first one has OCRed text and the other doesn't, and they use different compression.

The reason I am asking is that the first file's OCRed text contains some errors, and the file displays the OCRed text on top of the corresponding image, so I cannot tell what the correct text is. This is where the second file can help me.

I would like to do one of the following:

Make the first file show the image, with the OCRed text hidden so that it does not cover the image.
Alternatively, move the OCRed text from the first file to the second.
Alternatively, remove the OCRed text from the first file and then re-OCR it, since Adobe Acrobat can't re-OCR a PDF file that already has OCRed text.

So I wonder whether there is a Python module that can move the OCRed text layer from the first file to the second, removing it from the first file in the process?

If not, which languages have such libraries?

Thanks!

1 answer

Check out pdfminer; it's not exactly a user-friendly API, but you should be able to navigate the PDF structure and drop the obstructing text. You can come back with specific questions.
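To give a rough idea of what "dropping the obstructing text" means at the content-stream level, here is a minimal sketch in plain Python. It does not use pdfminer's API. It assumes you have already obtained a page's decompressed content stream as bytes (for example with a PDF library), and that the OCRed text sits in ordinary `BT ... ET` text objects. The names `TEXT_OBJECT` and `strip_text_objects` are made up for this illustration, and the regex is a simplification: real files can contain `BT` or `ET` inside string literals or inline image data, which this does not handle.

```python
import re

# One text object: everything from a BT operator to the next ET operator.
# Assumption: the stream is already decompressed, and BT/ET only occur as
# standalone operators separated by whitespace (not guaranteed in real PDFs).
TEXT_OBJECT = re.compile(rb"(?<![^\s])BT(?=\s).*?(?<=\s)ET(?![^\s])", re.DOTALL)


def strip_text_objects(content: bytes) -> bytes:
    """Return the content stream with every BT ... ET block removed."""
    return TEXT_OBJECT.sub(b"", content)


# Hypothetical page content: one image draw followed by one OCR text object.
sample = (
    b"q 612 0 0 792 0 0 cm /Im0 Do Q\n"
    b"BT /F1 12 Tf 72 720 Td (Helo wrld) Tj ET\n"
)
cleaned = strip_text_objects(sample)
```

In this sketch the image-drawing operators (`/Im0 Do`) are left untouched and only the text object is removed. Writing the modified stream back into the file is a separate step that depends on the library you use.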

But if it's just a question of hiding the OCR, you may be able to hide it if you open the file in Acrobat; IIRC it has options for showing just the OCR, just the background, or both.


