
PDFsharp Compression for Images

I'm building a tool which will join multiple PDFs into a single file.

Each source PDF is a single page containing one image. The images are high-resolution greyscale and generally quite large.

I need to find the most effective compression to apply to these images so that the resulting file is smaller than it is now (with around 240 source PDFs, the final file comes out at over 650 MB).

My question is: would it be possible to extract the images from the source PDFs, convert them to greyscale TIFFs, and then compile a new PDF using those as sources? My hope is that this approach would make use of the built-in LZ compression, rather than the JPEG path, where the image is simply copied into the PDF byte by byte.

The images are high resolution and large, so even scaling them down would make a difference. I will be testing this today. The images will be printed in a portfolio book, so a higher resolution is preferable; however, as the book will be A5, they don't need to be enormous.

I would be grateful for any suggestions of a better implementation, although I'm stuck with using these one page PDFs as my sources - there are simply too many images to start from scratch using the original sources, so extracting them from the source PDFs is my only real option.

PDFsharp does not reduce your images in any way (that may come in the future, but currently images are not modified).

JPEG is a very efficient compression method. Image quality is reduced a bit, but file size shrinks drastically. Not suitable for line art, but very good for photos.
PDFsharp will optionally apply LZ compression to JPEG images, but that usually gains only 1% to 5%.
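
The small gain from compressing already-compressed data can be illustrated with a rough sketch in Python using the standard `zlib` module (Deflate, an LZ-based method). This does not use PDFsharp or real images: random bytes stand in for a JPEG's entropy-coded data and a smooth gradient stands in for raw greyscale pixels, both of which are assumptions made purely for illustration. A real JPEG also contains headers and tables, so its saving will differ somewhat from the stand-in.

```python
import os
import zlib


def deflate_saving(data: bytes) -> float:
    """Return the fraction of bytes saved by zlib (Deflate) at maximum level."""
    compressed = zlib.compress(data, 9)
    return 1.0 - len(compressed) / len(data)


# Stand-in for JPEG entropy-coded data: random bytes (an assumption, not a real JPEG).
jpeg_like = os.urandom(1_000_000)

# Stand-in for raw greyscale scanlines: a horizontal gradient, 1000 x 1000 pixels.
raw_like = bytes(x * 255 // 999 for x in range(1000)) * 1000

print(f"already-compressed stand-in: {deflate_saving(jpeg_like):.1%} saved")
print(f"raw gradient stand-in:       {deflate_saving(raw_like):.1%} saved")
```

The first figure should come out close to zero, while the second should be very large, which is why a second lossless pass over JPEG data rarely pays off.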

It's up to you to scale the images down. If you go for JPEG you have to decide which JPEG quality you need - lower quality gives smaller images.
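
As a rough guide to how far the images can be scaled down, the sketch below computes the pixel dimensions of an A5 page at a chosen print resolution. The 300 DPI target and the 5000 x 7000 pixel source size are assumptions for illustration only; the printer's actual requirement and the real image sizes should be used instead. The helper names are hypothetical.

```python
MM_PER_INCH = 25.4
A5_MM = (148, 210)  # ISO 216 A5, width x height in millimetres


def pixels_for_print(size_mm, dpi):
    """Pixel dimensions needed to cover size_mm at the given DPI."""
    return tuple(round(mm / MM_PER_INCH * dpi) for mm in size_mm)


def scale_factor(source_px, target_px):
    """Uniform scale that fits the source inside the target, never upscaling."""
    return min(target_px[0] / source_px[0], target_px[1] / source_px[1], 1.0)


target = pixels_for_print(A5_MM, 300)   # assumed print resolution: 300 DPI
source = (5000, 7000)                   # hypothetical source image size in pixels
factor = scale_factor(source, target)

print("target pixels:", target)
print(f"scale factor: {factor:.4f}")
# The pixel count (and roughly the uncompressed data) shrinks by factor squared.
print(f"pixel count ratio: {factor ** 2:.3f}")
```

Because the pixel count falls with the square of the scale factor, even a moderate reduction in resolution cuts the image data considerably before any JPEG quality setting is applied.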

You can try using TIFF or PNG images. PDFsharp will also apply LZ compression when embedding them in PDF, but in most cases JPEG will achieve much better compression results.

Without seeing a real PDF with a real image I can only provide a general answer about how PDFsharp handles images.

