
How do I fix the extra boxes that pdf2image generates when converting a PDF to images?

I'm trying to convert a PDF to images using pdf2image, but the output contains extra boxes that are not in the original. This is a screenshot of my input PDF file:

[image: input file]

from pdf2image import convert_from_path

images = convert_from_path('input_pdf.pdf',output_folder=r'C:\Users\Baith')

images[0].save('output.jpg')

After executing the code above, I got this output:

[image: output_file]

Since pdf2image is only a thin wrapper around pdftoppm, which is itself part of poppler, I would advise trying different parameters with the CLI tools to see if a specific combination works.
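One way to try several pdftoppm parameter combinations is to script them. The following is only a sketch: the helper names `build_pdftoppm_command` and `try_variants` and the output prefixes are made up for illustration, and the choice of flags to vary (`-cropbox`, `-aa`, `-aaVector`) is an assumption about what might affect the boxes, not a confirmed fix.

```python
import shutil
import subprocess


def build_pdftoppm_command(pdf_path, out_prefix, dpi=200,
                           use_cropbox=False, antialias=True):
    """Build a pdftoppm command line that renders page 1 as PNG."""
    cmd = ["pdftoppm", "-png", "-r", str(dpi), "-f", "1", "-l", "1"]
    if use_cropbox:
        # Render the crop box instead of the media box.
        cmd.append("-cropbox")
    if not antialias:
        # Disable font and vector anti-aliasing.
        cmd += ["-aa", "no", "-aaVector", "no"]
    cmd += [pdf_path, out_prefix]
    return cmd


def try_variants(pdf_path):
    """Run a few flag combinations and return the output prefixes written."""
    if shutil.which("pdftoppm") is None:
        print("pdftoppm was not found on PATH; install poppler first.")
        return []
    written = []
    for use_cropbox in (False, True):
        for antialias in (True, False):
            # Hypothetical naming scheme so the results are easy to compare.
            prefix = f"out_crop{int(use_cropbox)}_aa{int(antialias)}"
            cmd = build_pdftoppm_command(pdf_path, prefix,
                                         use_cropbox=use_cropbox,
                                         antialias=antialias)
            subprocess.run(cmd, check=True)
            written.append(prefix)
    return written
```

Calling `try_variants('input_pdf.pdf')` would write one PNG per combination, which can then be compared by eye to see whether any of them drops the boxes.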

As for pdf2image itself, you might want to try use_cropbox=True and see if it still adds lines.
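A minimal sketch of that suggestion, reusing the file names from the question. The helper names `conversion_kwargs` and `render_first_page` are hypothetical, and limiting the conversion to the first page is an assumption made to keep the test quick; whether `use_cropbox=True` actually removes the boxes depends on the PDF.

```python
def conversion_kwargs(use_cropbox=True, dpi=200):
    """Keyword arguments for convert_from_path, first page only."""
    return {
        "dpi": dpi,
        "first_page": 1,
        "last_page": 1,
        "use_cropbox": use_cropbox,  # the option suggested above
    }


def render_first_page(pdf_path, out_path, use_cropbox=True):
    """Render page 1 of pdf_path and save it to out_path."""
    # Imported here so the helper above works without pdf2image installed.
    from pdf2image import convert_from_path

    images = convert_from_path(pdf_path, **conversion_kwargs(use_cropbox))
    images[0].save(out_path)
    return out_path
```

For example, `render_first_page('input_pdf.pdf', 'output_cropbox.jpg')` and `render_first_page('input_pdf.pdf', 'output_default.jpg', use_cropbox=False)` give two files to compare side by side.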

Feel free to open an issue directly on the repository. If you can provide a sample PDF, I would be happy to assist with the issue.
