
Appropriate Java Heap Size

When I try to merge multiple PDF documents, I get the following error:

    PDFMerger failed with the following exception:
    org.apache.pdfbox.exceptions.WrappedIOException
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:278)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1220)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1187)
    at org.apache.pdfbox.util.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:237)
    at org.apache.pdfbox.util.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:194)
    at org.apache.pdfbox.PDFMerger.merge(PDFMerger.java:82)
    at org.apache.pdfbox.PDFMerger.main(PDFMerger.java:44)
    at org.apache.pdfbox.PDFBox.main(PDFBox.java:83)

    Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.pdfbox.io.RandomAccessBuffer.clone(RandomAccessBuffer.java:69)
    at org.apache.pdfbox.cos.COSStream.clone(COSStream.java:78)
    at org.apache.pdfbox.cos.COSStream.<init>(COSStream.java:102)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:409)
    at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:650)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:203)
    ... 7 more

I think the obvious solution is to increase the heap space, but I am not sure. The routine works with 20-30 files, but with close to 100 it throws the exception.

The environment is an Apache 2 web server with Java 1.8.0, and I'm calling the command through PHP exec():

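    // Build the PDFBox command line; each path is escaped so the shell
    // treats it as a single argument
    $mergepdf = "java -jar pdfbox-app-1.8.9.jar PDFMerger ";

    // Append every source PDF (named <userid>-<key>.pdf)
    foreach ($drawings as $key => $id) {
        $mergepdf .= escapeshellarg($path.$userid."-".$key.".pdf")." ";
    }

    // The last argument is the merged output file
    $mergepdf .= escapeshellarg($path.$pdffilename);

    // Make the compiled PDF
    exec($mergepdf);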

A user can request any number of PDFs to download; the intent is to merge them and offer a single compiled PDF. The number and sizes of the PDFs are unknown at the time of programming: in a worst-case scenario, the count could be over 1,000, with each file between 2 MB and 30 MB.

What is a safe limit for the heap size, or how do I determine an appropriate heap size for my routine? What impact can I expect this to have on the web server while it is executing? Is there an issue with cranking it to the max?

I am using a t2.micro instance on EC2.

The maximum heap size is the point at which you would rather the program fail than continue to use more memory. It is usually determined by the size of the machine you have; for example, you might set the maximum to 80% of the machine's memory.
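
To make the 80% rule of thumb concrete, here is a minimal sketch that turns a machine's memory size into a `-Xmx` value. The class name `HeapSizing` and the method `heapLimitMb` are hypothetical, made up for illustration. The 1024 MiB default assumes the 1 GiB of RAM a t2.micro provides, and the 0.8 fraction is only the example figure from above; on a box that also runs Apache and PHP, a lower fraction may well be more appropriate.

```java
public class HeapSizing {

    // Hypothetical helper: returns the heap limit in MiB for a given
    // amount of physical memory (in bytes) and the fraction to reserve.
    static long heapLimitMb(long physicalBytes, double fraction) {
        long limitBytes = (long) (physicalBytes * fraction);
        return limitBytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Assumption: machine memory in MiB is passed as the first argument;
        // the default of 1024 matches a t2.micro (1 GiB).
        long physicalMb = args.length > 0 ? Long.parseLong(args[0]) : 1024L;
        long limitMb = heapLimitMb(physicalMb * 1024L * 1024L, 0.8);

        // The flag to add to the java command line.
        System.out.println("-Xmx" + limitMb + "m");

        // For comparison: the maximum heap this JVM is currently allowed to use.
        long currentMaxMb = Runtime.getRuntime().maxMemory() / (1024L * 1024L);
        System.out.println("current max heap (MiB): " + currentMaxMb);
    }
}
```

The printed flag would then go in front of `-jar` in the command built by the PHP script, for example `java -Xmx819m -jar pdfbox-app-1.8.9.jar PDFMerger ...`. Note that this only caps the heap; it does not guarantee that a merge of 1,000 large files fits within it.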
