
ImageMagick Split PDF Output File Name Always Starts at Zero

I run the following command to split a PDF in ImageMagick:

convert file.pdf[5-10] file.png

The resulting output files are always suffixed starting with zero. That is:

file-0.png, file-1.png, file-2.png...

Any ideas what I might be doing wrong? The documentation states that the files should be suffixed starting at 5, matching the page numbers of the pages extracted.

I ended up solving this by using the -scene # command line parameter.

This causes the output to begin at the desired index. For posterity:

convert file.pdf -scene 5 file-%d.png

You see the result you describe because ImageMagick's page index for multi-page image formats is zero-based: page 1 has index 0, page 2 has index 1, and so on.
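To make the zero-based indexing concrete, here is a minimal sketch of a helper that turns a 1-based page range into the bracket selector ImageMagick expects. The function name im_page_selector is made up for illustration; it is not part of ImageMagick.

```shell
# Hypothetical helper (not an ImageMagick command): convert a 1-based
# page range into a zero-based index selector, e.g. pages 5..10 -> [4-9].
im_page_selector() {
    first_page=$1
    last_page=$2
    if [ "$first_page" -lt 1 ] || [ "$last_page" -lt "$first_page" ]; then
        echo "usage: im_page_selector FIRST LAST (1-based, FIRST <= LAST)" >&2
        return 1
    fi
    # Subtract 1 from both ends because index 0 is page 1.
    printf '[%d-%d]\n' "$((first_page - 1))" "$((last_page - 1))"
}

im_page_selector 5 10
```

By the same arithmetic, file.pdf[5-10] in the question selects pages 6 to 11, not 5 to 10. Assuming the -scene option behaves as described in the answer above, something like convert "file.pdf$(im_page_selector 5 10)" -scene 5 file-%d.png should give pages 5 to 10 with matching file names, but I have not checked this on every ImageMagick version.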

Also, ImageMagick cannot process PDF input files itself: it employs Ghostscript as its 'delegate' -- Ghostscript consumes the PDF first and emits a raster file for each PDF page. Only these raster files are then processed by ImageMagick.

Depending on your exact ImageMagick version and IM setup, this may result in an indirect PNG output generation, and the conversion chain may look like this:

PDF --> PPM (portable pixmap) --> PNG
     ^                         ^
     |                         |
     |                         +-- (handled by ImageMagick)
     +-- (handled by Ghostscript)

If you are unlucky, the result will be slow and the quality may not be as good as it could be.

To verify what exactly happens in a convert a.pdf a.png command, you can add the -verbose parameter. That will show you the Ghostscript command being employed by IM to process the PDF input:

convert -verbose a.pdf a.png

 /var/tmp/magick-15951W3TZ3WRpwIUk1 PNG 612x792 612x792+0+0 8-bit sRGB 3.73KB 0.000u 0:00.000
 a.pdf PDF 612x792 612x792+0+0 16-bit sRGB 3.73KB 0.000u 0:00.000
 a.pdf=>a.png PDF 612x792 612x792+0+0 8-bit sRGB 2c 2.95KB 0.000u 0:00.000

 [ghostscript library] -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT \
   -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pngalpha" \
   -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72" \
  "-sOutputFile=/var/tmp/magick-15951W3TZ3WRpwIUk%d" \
  "-f/var/tmp/magick-15951nJD8-fF8kA7j" \
  "-f/var/tmp/magick-15951JTZDMwtEswHn"

(As you can see, my IM installation is set up to do a PDF->PNG conversion without the detour via PPM... Your mileage may vary.)
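If you only want to know which Ghostscript output device your setup uses, you can filter the verbose output for the -sDEVICE argument. This is a small sketch; gs_device_from_log is a made-up name, and it assumes the delegate command line appears in the captured output in the form shown above.

```shell
# Hypothetical helper: read captured `convert -verbose` output on stdin
# and print the Ghostscript device name(s) found in -sDEVICE=... arguments.
gs_device_from_log() {
    sed -n 's/.*-sDEVICE=\([A-Za-z0-9_]*\).*/\1/p' | sort -u
}

# Demo with one line copied from the verbose output above.
printf '%s\n' '-dMaxBitmap=500000000 -dGridFitTT=2 "-sDEVICE=pngalpha" \' |
    gs_device_from_log
```

In real use you would run something like convert -verbose a.pdf a.png 2>&1 | gs_device_from_log, with 2>&1 so that it does not matter which stream the verbose text is written to. A PNG device such as pngalpha means direct PDF->PNG conversion; a device such as ppmraw or pnmraw would suggest the detour via PPM.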

You may get better results when using Ghostscript directly, instead of running an IM convert command. (If ImageMagick works at all with PDF->PNG conversion, you have a working Ghostscript installation for sure.) So you can try this:

gs                  \
 -o file-%03d.png   \
 -sDEVICE=pngalpha  \
  file.pdf

The -%03d file name suffix will cause Ghostscript to output file-001.png, file-002.png, file-003.png, and so on.
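The %03d placeholder follows the usual printf convention (zero-padded, three digits wide), so you can preview the names with the shell's own printf before running Ghostscript. The helper name preview_output_names is made up, and this only imitates the naming; it does not call Ghostscript.

```shell
# Hypothetical helper: print the file names a printf-style pattern
# yields for the numbers FIRST..LAST. Uses the shell's printf only.
preview_output_names() {
    pattern=$1
    n=$2
    last=$3
    while [ "$n" -le "$last" ]; do
        printf "$pattern\n" "$n"
        n=$((n + 1))
    done
}

preview_output_names 'file-%03d.png' 1 3
```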

However, if you are unlucky and have an older version of Ghostscript installed, the numbering may start at file-000.png instead...

In any case, since your sample command seems to suggest that you want to convert only a page range (5--10) from the PDF file (not all pages), here is the command to use:

gs                  \
 -o file-%03d.png   \
 -sDEVICE=pngalpha  \
 -dFirstPage=5      \
 -dLastPage=10      \
  file.pdf

But the bad news here is: Ghostscript will STILL start with naming the output files as file-001.png (page 5) ... file-006.png (page 10).

To work around that, you'll have to generate the PNGs for the first 4 pages too, and later delete them again:

gs                  \
 -o file-%03d.png   \
 -sDEVICE=pngalpha  \
 -dFirstPage=1      \
 -dLastPage=10      \
  file.pdf

rm -f file-00{1,2,3,4}.png
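If rendering pages you then throw away feels wasteful, another option is to render only pages 5 to 10 and shift the numbers afterwards. The sketch below is one way to do that; renumber_pages is a made-up helper, and it assumes the files are named PREFIX-001.png, PREFIX-002.png, ... exactly as the -o file-%03d.png pattern produces them.

```shell
# Hypothetical helper: shift 1-based output numbers so they match the
# PDF page numbers. Arguments: PREFIX COUNT FIRST_PAGE.
# Example: renumber_pages file 6 5   (file-001.png -> file-005.png, ...)
renumber_pages() {
    rp_prefix=$1
    rp_count=$2
    rp_first=$3
    [ "$rp_first" -gt 1 ] || return 0   # numbering already matches
    # Walk from the highest number down so that a rename never
    # overwrites a file that still has to be moved.
    rp_i=$rp_count
    while [ "$rp_i" -ge 1 ]; do
        rp_src=$(printf '%s-%03d.png' "$rp_prefix" "$rp_i")
        rp_dst=$(printf '%s-%03d.png' "$rp_prefix" "$((rp_i + rp_first - 1))")
        mv -- "$rp_src" "$rp_dst" || return 1
        rp_i=$((rp_i - 1))
    done
}
```

After the -dFirstPage=5 -dLastPage=10 run shown earlier, calling renumber_pages file 6 5 would leave you with file-005.png to file-010.png. The descending loop matters: renaming file-001.png first would clobber the still-unmoved file-005.png.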
