I am trying to convert my PDF files to JPG. I first use pdf2image to save the file as a .ppm. Then I want to use PIL to convert the .ppm to .jpg.
How do I find the name of the file that pdf2image saved?
Here is my code:
from pdf2image import convert_from_path
from PIL import Image

def to_jpg(just_ids):
    for just_id in just_ids:
        image = convert_from_path('/Users/davidtannenbaum/Desktop/scraped/{}.pdf'.format(just_id), output_folder='/Users/davidtannenbaum/Desktop/scraped/')
        file_name = ?  # this is the part I don't know how to fill in
        im = Image.open("/Users/davidtannenbaum/Desktop/scraped/{}.ppm".format(file_name))
        im.save("/Users/davidtannenbaum/Desktop/scraped/{}.jpg".format(just_id))
You don't need to. The image variable should already contain a list of PIL Image objects, one per page. You can simply do:
for i, im in enumerate(image):
    im.save("/Users/davidtannenbaum/Desktop/scraped/{}_{}.jpg".format(just_id, i))
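Putting that together with the loop from the question, a minimal sketch might look like the following. The `jpg_path` helper and the `folder` argument are hypothetical names introduced here for illustration; the sketch assumes each PDF is named `<id>.pdf` inside `folder` and that pdf2image (with poppler) is installed.

```python
import os


def jpg_path(output_folder, just_id, page_index):
    """Build the target path for one page: <folder>/<id>_<page>.jpg."""
    return os.path.join(output_folder, "{}_{}.jpg".format(just_id, page_index))


def to_jpg(just_ids, folder):
    """Convert <folder>/<id>.pdf to one JPG per page and return the saved paths."""
    # Imported here so that jpg_path stays usable without pdf2image installed.
    from pdf2image import convert_from_path

    saved = []
    for just_id in just_ids:
        # Without output_folder, the pages come back as in-memory PIL images,
        # so no intermediate .ppm file has to be located.
        pages = convert_from_path(os.path.join(folder, "{}.pdf".format(just_id)))
        for i, page in enumerate(pages):
            target = jpg_path(folder, just_id, i)
            page.save(target, "JPEG")
            saved.append(target)
    return saved
```

Returning the list of saved paths means the caller never has to guess the file names afterwards.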
The convert_from_path() function has a few more parameters you can use. You can set the paths_only parameter to True and the format parameter fmt to "jpeg".
This saves your images directly to the output folder in JPG format instead of PPM, and the image variable will contain the path to each image file instead of the image objects.
for just_id in just_ids:
    image = convert_from_path('/Users/davidtannenbaum/Desktop/scraped/{}.pdf'.format(just_id), output_folder='/Users/davidtannenbaum/Desktop/scraped/', fmt="jpeg", paths_only=True)
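If paths_only is not available in your installed version, one generic fallback is to compare the folder listing before and after the conversion call. The sketch below uses only the standard library; `new_files` is a hypothetical helper, not part of pdf2image, and it assumes nothing else writes to the folder while the call runs.

```python
import os


def new_files(folder, action):
    """Run action() and return the sorted paths of the files it added to folder."""
    before = set(os.listdir(folder))
    action()
    after = set(os.listdir(folder))
    return sorted(os.path.join(folder, name) for name in after - before)


# Hypothetical usage with pdf2image (pdf and out_dir are placeholder names):
# created = new_files(out_dir, lambda: convert_from_path(pdf, output_folder=out_dir))
```

The returned paths can then be opened with PIL's Image.open() and re-saved as .jpg.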
import os
from pdf2image import convert_from_path

pdf_path = '/path/to/pdf_images/'
output_folder = '/path/for/output/images/'
for pdf in os.listdir(pdf_path):
    filename = os.path.splitext(pdf)[0]  # base name without the extension, used as the output prefix
    # output_file is the file name prefix inside output_folder, not a full path
    pdfs = convert_from_path(os.path.join(pdf_path, pdf), output_folder=output_folder, output_file=filename, fmt="jpeg")