Converting multi-page PDFs to several JPGs using ImageMagick and/or GhostScript

I am trying to convert a multi-page PDF file into a bunch of JPEGs, one for each page in the PDF. I have spent hours and hours looking up how to do this, and eventually I discovered that I need Ghostscript installed. So I did that (from this website: http://downloads.ghostscript.com/public/, where I used the most recent link, "ghostscript-9.05.tar.gz", from Feb 8, 2012).

However, even with this installed/downloaded, I am still unable to do what I want. Should I have this saved somewhere special, like in the same folder as ImageMagick?

What I have figured out so far is this:

  • In Command Prompt I change the working directory to the folder where ImageMagick is saved.

  • I then type

     convert "<full file path to pdf>" "<full file path to jpg>" 

This is followed by a giant blob of errors. It begins with:

    Unrecoverable error: rangecheck in .setuserparams
    Operand stack:

This is followed by a blurb of unreadable numbers and caps. It ends with:

    While reading gs_lev2.ps:
    %%[ Error: invalidaccess; OffendingCommand: put ]%%

Needless to say, after hours and hours of deliberation, I don't think I am any closer to doing the seemingly simple task of converting this PDF into a JPG.

What I would like are some step-by-step instructions on how to make this work. Don't leave out anything, no matter how "obvious" it might seem (especially anything involving Ghostscript). This has been troubling me and my supervisor for months now.

For further clarification, we are on a Windows XP operating system. The eventual intention is to call these command lines in R, the statistical language, and run it in a script. In addition, I have been able to successfully convert JPGs to PNG format and vice versa, but PDF just is not working.

Help!!!

You don't need ImageMagick for this; Ghostscript can do it all alone. (Even if you used ImageMagick, it couldn't do that conversion itself: it HAS to use Ghostscript as its 'delegate'.)

Try this to use Ghostscript directly:

 c:\path\to\gswin32c.exe ^
   -o page_%03d.jpg ^
   -sDEVICE=jpeg ^
    d:/path/to/input.pdf

This will create a new JPEG for each page, and the filenames will increment as page_001.jpg, page_002.jpg, ...
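Since the question mentions eventually calling this from a script, here is a minimal Python sketch of the same call. It assumes `gswin32c.exe` is reachable on the PATH; the function names `build_gs_command` and `convert` are made up for illustration and are not part of Ghostscript.

```python
import subprocess

def build_gs_command(pdf_path, out_pattern="page_%03d.jpg", device="jpeg",
                     gs_exe="gswin32c.exe"):
    """Return the Ghostscript argument list for one-image-per-page output.

    gs_exe is an assumption: pass the full path if Ghostscript is not on PATH.
    """
    return [gs_exe, "-o", out_pattern, "-sDEVICE=" + device, pdf_path]

def convert(pdf_path, **kwargs):
    # check=True raises CalledProcessError if Ghostscript exits with an error.
    subprocess.run(build_gs_command(pdf_path, **kwargs), check=True)

# Ghostscript fills in %03d with the page number, printf-style.
# Python's % operator uses the same convention, so this previews the names:
print("page_%03d.jpg" % 1)
print("page_%03d.jpg" % 12)
print(build_gs_command("d:/path/to/input.pdf"))
```

Passing the arguments as a list avoids the shell, so the `%` reaches Ghostscript untouched. If the command is placed in a `.bat` file instead, the percent sign has to be doubled (`page_%%03d.jpg`).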

Note that this will create JPEGs using all the default settings of the jpeg device (one of the most important being the resolution, which will be 72dpi).

If you need a higher (or lower) resolution for your images, you can add other options:

 gswin32c.exe ^
   -o page_%03d.jpg ^
   -sDEVICE=jpeg ^
   -r300 ^
   -dJPEGQ=100 ^
    d:/path/to/input.pdf

-r300 sets the resolution to 300dpi and -dJPEGQ=100 sets the highest JPEG quality level (Ghostscript's default is 75).
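To get a feel for what a given `-r` value means for the output, the arithmetic can be sketched as follows. PDF page dimensions are measured in points (1/72 inch), so the pixel size scales linearly with the resolution. The helper name `pixel_size` is hypothetical, the US Letter page (612 x 792 points) is just an example, and Ghostscript's own rounding may differ by a pixel.

```python
def pixel_size(width_pt, height_pt, dpi):
    """Approximate raster size in pixels for a page given in PDF points."""
    # 72 points = 1 inch, so pixels = points / 72 * dpi.
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)

# US Letter is 8.5 x 11 inches = 612 x 792 points.
print(pixel_size(612, 792, 72))   # (612, 792)   at the default resolution
print(pixel_size(612, 792, 300))  # (2550, 3300) with -r300
```

Going from 72dpi to 300dpi therefore multiplies the pixel count by roughly 17, which is worth keeping in mind for file size and conversion time.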

Also note, please: JPEG is not well suited to represent shapes with sharp edges and high contrast in good quality (such as you typically see in black-on-white text pages with small characters).

The (lossy) JPEG compression method is optimized for continuous-tone pictures and photos, not for line graphics. Therefore it is sub-optimal for PostScript or PDF input pages which mainly contain text. Here, the lossy compression of the JPEG format will result in poorer quality output even if the input is excellent. See also the JPEG FAQ for more details on this topic.

You may get better image output by choosing PNG as the output format (PNG uses a lossless compression):

 gswin32c.exe ^
   -o page_%03d.png ^
   -sDEVICE=png16m ^
   -r150 ^
    d:/path/to/input.pdf

The png16m device produces 24-bit RGB color. You could swap this for pnggray (for pure grayscale output), png256 (for 8-bit color), png16 (4-bit color), pngmono (black and white only) or pngmonod (alternative black-and-white module).
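As a rough way to compare these devices, the sketch below estimates the uncompressed raster size of one page for each of them. The 24-, 8- and 4-bit figures come from the list above; 8 bits for pnggray and 1 bit for the two monochrome devices are assumptions. The 1275 x 1650 pixel page (US Letter at -r150) and the names `PNG_DEVICE_BITS` and `raw_raster_bytes` are illustrative only, and the actual PNG files will be smaller because PNG compresses losslessly.

```python
import math

# Bits per pixel for each PNG device (pnggray and the mono devices are assumed).
PNG_DEVICE_BITS = {
    "png16m": 24,
    "pnggray": 8,
    "png256": 8,
    "png16": 4,
    "pngmono": 1,
    "pngmonod": 1,
}

def raw_raster_bytes(device, width_px, height_px):
    """Uncompressed size in bytes, with each row padded to a whole byte."""
    bits = PNG_DEVICE_BITS[device]
    row_bytes = math.ceil(width_px * bits / 8)
    return row_bytes * height_px

for device in PNG_DEVICE_BITS:
    print(device, raw_raster_bytes(device, 1275, 1650))
```

For text-only pages, a grayscale or monochrome device keeps edges sharp while handling far less data per page than png16m.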

There are numerous SaaS services that will do this for you too. HyPDF and Blitline come to mind.
