
PDFClown image extraction images inverted

I'm working with PDFClown and trying to extract images from a PDF file. I'm using the example code that ships with the source distribution, available at http://pdfclown.org:

ImageExtractionSample.java.

The problem is that the extracted images come out as negatives (inverted colors) and flipped horizontally. Does anyone know how to resolve this?

Try other PDF files first to see whether they also produce rotated or flipped images. ImageExtractionSample.java does not check the rotation or any matrix-defined transformation applied to the image object; it just writes the image content to a file as is. That works for JPG images but not, for example, for CCITT-encoded images.

So there are several things to consider when you extract images from a PDF:

  • image can be rotated using the attached transformation matrix (CTM);
  • image can be rotated/transformed as part of the form which is transformed;
  • image can be placed without transformation on a page but the page itself is rotated;
  • image may have a Mask overlaid on top of it (and the Mask can itself be rotated and transformed);
  • JPG image is stored pretty much as is, but PDF supports other formats too, such as CCITT compression, LZW-compressed images etc.
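
As a rough illustration of the first point, a PDF transformation matrix [a b c d e f] can be inspected for mirroring and rotation with plain java.awt.geom, independently of PDFClown. This is only a sketch: the class and method names are hypothetical, and obtaining the matrix that is in effect when the image is painted is left to whatever your PDF library exposes.

```java
import java.awt.geom.AffineTransform;

// Hypothetical helper, not part of PDFClown: inspects a PDF matrix [a b c d e f].
final class CtmInspector {

    // The PDF matrix [a b c d e f] maps x' = a*x + c*y + e, y' = b*x + d*y + f,
    // which is the same argument order as the AffineTransform constructor.
    static AffineTransform fromPdfMatrix(double a, double b, double c,
                                         double d, double e, double f) {
        return new AffineTransform(a, b, c, d, e, f);
    }

    // A negative determinant (a*d - b*c) means the matrix contains a reflection.
    static boolean isMirrored(AffineTransform ctm) {
        return ctm.getDeterminant() < 0;
    }

    // Angle by which the x axis is turned, in degrees (atan2(b, a)).
    // For a mirrored matrix this angle alone does not describe the full transform.
    static double rotationDegrees(AffineTransform ctm) {
        return Math.toDegrees(Math.atan2(ctm.getShearY(), ctm.getScaleX()));
    }
}
```

For example, the matrix [0 100 -100 0 0 0] reports a 90 degree rotation and no mirroring, while [-200 0 0 100 200 0] reports a mirrored image.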

The general suggestion, though, is that when you extract a JPG image from a PDF using PDFClown, you should simply flip and rotate the extracted image afterwards, as suggested on the SourceForge project discussion page.
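
A minimal sketch of such post-processing, using only the standard java.awt.image classes on an image that has already been decoded (for example with ImageIO.read). The class and method names are hypothetical and none of this is PDFClown API; it covers the two symptoms from the question, a horizontal flip and a negative image.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of PDFClown: fixes up an already decoded image.
final class ExtractedImageFixer {

    // Mirrors the image left to right.
    static BufferedImage flipHorizontally(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                out.setRGB(w - 1 - x, y, src.getRGB(x, y));
            }
        }
        return out;
    }

    // Inverts every RGB channel, turning a negative back into a positive.
    static BufferedImage invert(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Flip all colour bits; the result image has no alpha channel.
                out.setRGB(x, y, ~src.getRGB(x, y) & 0xFFFFFF);
            }
        }
        return out;
    }
}
```

Calling `ExtractedImageFixer.invert(ExtractedImageFixer.flipHorizontally(image))` applies both corrections. Whether both are needed depends on the particular PDF, so compare the result against how the page actually renders.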

If you could point to a particular sample PDF file, it would be easier to suggest a solution.

If you're on Windows, you can use the free PDF Multitool utility to compare the non-transformed and transformed images from a PDF, using the "Extract raw images (without transformation)" option in the image extraction dialog.

Disclaimer: I work for ByteScout. The PDF Multitool utility is free for both commercial and non-commercial purposes.
