
Converting PDF Pages to JPG on Java-GAE

I am searching for an open-source Java library that lets me render single pages of a PDF as JPG or PNG on the server side.

Unfortunately, it must not use any java.awt.* classes other than:

  • java.awt.datatransfer.DataFlavor
  • java.awt.datatransfer.MimeType
  • java.awt.datatransfer.Transferable

If there is any way, a little code-snippet would be fantastic.

I believe ICEpdf might have what you are looking for.

I used this open-source project a while back to turn uploaded PDFs into images for an online catalog.

import java.awt.image.BufferedImage;
import java.io.FileNotFoundException;
import java.io.IOException;

import org.icepdf.core.exceptions.PDFException;
import org.icepdf.core.exceptions.PDFSecurityException;
import org.icepdf.core.pobjects.Document;
import org.icepdf.core.pobjects.Page;
import org.icepdf.core.util.GraphicsRenderingHints;


public byte[][] convert(byte[] pdf, String format) {

    Document document = new Document();
    try {
        document.setByteArray(pdf, 0, pdf.length, null);
    } catch (PDFException ex) {
        System.out.println("Error parsing PDF document " + ex);
        return new byte[0][];
    } catch (PDFSecurityException ex) {
        System.out.println("Error encryption not supported " + ex);
        return new byte[0][];
    } catch (FileNotFoundException ex) {
        System.out.println("Error file not found " + ex);
        return new byte[0][];
    } catch (IOException ex) {
        System.out.println("Error handling PDF document " + ex);
        return new byte[0][];
    }

    int pageCount = document.getNumberOfPages();
    // one encoded image per page
    byte[][] imageArray = new byte[pageCount][];
    float scale = 1.75f;
    float rotation = 0f;

    // Paint each page's content to an image and encode it as bytes
    for (int i = 0; i < pageCount; i++) {
        BufferedImage image = (BufferedImage)
                document.getPageImage(i,
                                      GraphicsRenderingHints.SCREEN,
                                      Page.BOUNDARY_CROPBOX, rotation, scale);
        try {
            // PictureUtilLocal and Component are helpers from my own
            // application, not part of ICEpdf; substitute your own encoder
            PictureUtilLocal pum = (PictureUtilLocal) Component
                    .getInstance("pictureUtil");
            // load image into util
            pum.loadBuffered(image);

            // write image in desired format
            imageArray[i] = pum.imageToByteArray(format, 1f);

            System.out.println("\t capturing page " + i);

        } catch (IOException e) {
            e.printStackTrace();
        }
        image.flush();
    }
    // clean up resources
    document.dispose();
    return imageArray;
}
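The pictureUtil helper used in the loop above is not shown in the answer. As a rough sketch of what that encoding step could look like with only the JDK, something like the following might work. It assumes javax.imageio and BufferedImage are usable in your runtime, which the question suggests may not be the case on GAE. The class name ImageBytes and the method toBytes are made up for illustration and are not part of ICEpdf.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical stand-in for the answerer's pictureUtil helper (not ICEpdf API).
final class ImageBytes {

    /** Encodes the image in the given ImageIO format name, e.g. "png" or "jpg". */
    static byte[] toBytes(BufferedImage image, String format) {
        BufferedImage source = image;
        boolean jpeg = format.equalsIgnoreCase("jpg") || format.equalsIgnoreCase("jpeg");

        // JPEG has no alpha channel, so flatten transparent images onto white first.
        if (jpeg && image.getColorModel().hasAlpha()) {
            source = new BufferedImage(image.getWidth(), image.getHeight(),
                                       BufferedImage.TYPE_INT_RGB);
            Graphics2D g = source.createGraphics();
            g.drawImage(image, 0, 0, Color.WHITE, null);
            g.dispose();
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            // ImageIO.write returns false when no writer handles the format.
            if (!ImageIO.write(source, format, out)) {
                throw new IllegalArgumentException("No ImageIO writer for format: " + format);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With a helper like this, the body of the loop would reduce to something like imageArray[i] = ImageBytes.toBytes(image, format).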

A word of caution, though: I have had trouble with this library causing a segfault on OpenJDK, while it worked fine on Sun's JDK. I am not sure how it would behave on GAE. I can't remember which version had the problem, so just be aware.

You can use the Apache PDFBox API for this purpose. The following code converts a PDF to JPG images, page by page.

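import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Iterator;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

// Uses the PDFBox 1.x API (getAllPages and convertToImage were removed in 2.x)
public void convertPDFToJPG(String src, String folderPath) {
    PDDocument doc = null;
    try {
        File folder = new File(folderPath);
        // comparePDF is my own helper class, not part of PDFBox
        comparePDF cmp = new comparePDF();
        cmp.rmdir(folder);

        // load the PDF file into the document object
        doc = PDDocument.load(new FileInputStream(src));
        // get all pages from the document and store them in a list
        List<PDPage> pages = doc.getDocumentCatalog().getAllPages();
        // iterate over the pages
        Iterator<PDPage> i = pages.iterator();
        int count = 1; // used to give each image file a unique name
        // convert every page of the PDF document to its own image file
        System.out.println("Please wait...");
        while (i.hasNext()) {
            PDPage page = i.next();
            BufferedImage bi = page.convertToImage();
            // new File(parent, child) avoids the Windows-only "\\" separator
            ImageIO.write(bi, "jpg", new File(folder, "Page" + count + ".jpg"));
            count++;
        }
        System.out.println("Conversion complete");
    } catch (IOException ie) {
        ie.printStackTrace();
    } finally {
        // release the document's resources even if conversion failed
        if (doc != null) {
            try {
                doc.close();
            } catch (IOException ie) {
                ie.printStackTrace();
            }
        }
    }
}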
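The file-writing step in the snippet above can be factored out so that it does not depend on a particular path separator and reports a missing JPEG writer instead of failing silently. The following is only a sketch: PageFiles and writePage are hypothetical names, and it assumes a writable local file system, which a sandboxed environment such as GAE may not provide.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper for writing one rendered page to disk (not PDFBox API).
final class PageFiles {

    /** Writes the page as folder/Page<pageNumber>.jpg and returns the file. */
    static File writePage(BufferedImage page, File folder, int pageNumber) {
        // Create the output folder if it does not exist yet.
        if (!folder.isDirectory() && !folder.mkdirs()) {
            throw new IllegalStateException("Cannot create folder: " + folder);
        }
        // File(parent, child) picks the right separator for the platform.
        File target = new File(folder, "Page" + pageNumber + ".jpg");
        try {
            // ImageIO.write returns false when no writer accepts the image.
            if (!ImageIO.write(page, "jpg", target)) {
                throw new IllegalArgumentException("No JPEG writer for this image type");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }
}
```

Inside the while loop, the ImageIO.write call could then be replaced by something like PageFiles.writePage(bi, new File(FolderPath), count).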


 