How to fix java.lang.OutOfMemoryError: Java heap space when generating a PDF document?

My system throws "java.lang.OutOfMemoryError: Java heap space" when it processes a huge file. I realized that StringWriter.toString() creates a second copy of the content, doubling its size on the heap, so it could be causing the issue. How can I optimize the following block of code to avoid running out of memory?

public byte[] generateFromFo(final StringWriter foString) {
        try {
            StringReader foReader = new StringReader(foString.toString());
            ByteArrayOutputStream pdfWriter = new ByteArrayOutputStream();
            Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, fopFactory.newFOUserAgent(),pdfWriter);
            TRANSFORMER_FACTORY.newTransformer().transform(new StreamSource(foReader), new SAXResult(fop.getDefaultHandler()));
            LOG.debug("Completed rendering PDF output!");
            return pdfWriter.toByteArray();
        } catch (Exception e) {
            LOG.error("Error while generating PDF from FO",e);
            throw new AuditReportExportServiceException(AuditErrorCode.INTERNAL_ERROR,"Could not generate PDF from XSL-FO");
        }
    }

Using an InputStream of bytes may reduce the memory needed for foString by up to a factor of 2 (a char takes 2 bytes).

A ByteArrayOutputStream grows as it is filled, so passing an estimated size as the initial capacity speeds things up and may avoid repeated resizing.

        // Hold the FO document as UTF-8 bytes rather than as 2-byte chars.
        InputStream foReader = new ByteArrayInputStream(
                foString.toString().getBytes(StandardCharsets.UTF_8));
        foString.close();
        // Estimated PDF size; presizing the buffer avoids repeated growth.
        final int initialCapacity = 160 * 1024;
        ByteArrayOutputStream pdfWriter = new ByteArrayOutputStream(initialCapacity);
        Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, fopFactory.newFOUserAgent(),
            pdfWriter);
        TRANSFORMER_FACTORY.newTransformer().transform(new StreamSource(foReader),
            new SAXResult(fop.getDefaultHandler()));

The best option would be to change the API:

public void generateFromFo(final String foString, OutputStream pdfOut) { ... }

This might make the ByteArrayOutputStream superfluous, and you could stream directly to a file, URL, or whatever.
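A minimal sketch of what such a streaming signature could look like. Apache FOP is not part of the JDK, so this example swaps in the JDK's identity Transformer writing to a StreamResult; with FOP, the result would instead be a SAXResult built from a Fop created on the caller's OutputStream. The class name StreamingFoSketch, the Reader parameter, and the IllegalStateException wrapper are assumptions for illustration, not part of the original code.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StreamingFoSketch {
    private static final TransformerFactory TRANSFORMER_FACTORY =
            TransformerFactory.newInstance();

    /**
     * Reads XML from foIn and writes the transformed output to out.
     * No byte[] holding the whole result is built inside this method;
     * the caller decides where the output goes.
     */
    static void generateFromFo(Reader foIn, OutputStream out) {
        try {
            // Identity transform as a stand-in for FOP (assumption).
            Transformer transformer = TRANSFORMER_FACTORY.newTransformer();
            transformer.transform(new StreamSource(foIn), new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not transform XSL-FO input", e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stream straight to a temporary file instead of an in-memory buffer.
        Path target = Files.createTempFile("fo-demo", ".xml");
        try (OutputStream out = Files.newOutputStream(target)) {
            generateFromFo(new StringReader("<root><item>1</item></root>"), out);
        }
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

Because the caller supplies the OutputStream, the same method can write to a file, an HTTP response, or a test buffer without being changed.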

The document itself and the generated PDF also raise issues:

  • image sizes (but remember that print needs a higher resolution)
  • some images can be nicely vectorized
  • repeated images, such as in a page header, should be stored once
  • fonts should ideally be the standard fonts; second best is embedded subsets (of the characters used)
  • the XML might be suboptimal and very repetitive

Broadly, you have two main options:

  1. Increase the memory available to your process. Java's -Xmx option sets the maximum heap size. You could pass e.g. -Xmx8G to ask for 8 GB of heap on a 64-bit system, if you have that much. Docs are here: http://docs.oracle.com/javase/7/docs/technotes/tools/windows/java.html#nonstandard

  2. Change your code to "stream" the data through in smaller chunks, rather than trying to assemble the whole file into a byte[] in memory, as you have done here. In the code shown, you could change the output of your transformer to a FileOutputStream rather than a ByteArrayOutputStream and return a File rather than a byte[]. Or, depending on what you do with the output of this method, you could return an InputStream and let the consumer receive the file data in a streaming fashion.

    You may also need to change things so that the input to this method is consumed in a streaming fashion. How to do that depends on the details of how the StringWriter foString was created. You may need to "pipe" an OutputStream into an InputStream to make this work, see https://docs.oracle.com/javase/7/docs/api/java/io/PipedInputStream.html
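The piping idea in option 2 could be sketched roughly as follows, using only JDK classes. The producer method writeFo and the class name PipedFoSketch are hypothetical stand-ins for whatever currently fills the StringWriter, and the consumer here only counts bytes; in the real code the PipedInputStream would be wrapped in a StreamSource and handed to the transformer. The two ends of a pipe must run on different threads, otherwise the writer can block forever once the pipe's buffer is full.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class PipedFoSketch {

    /** Hypothetical producer: writes the document piece by piece. */
    static void writeFo(OutputStream out, int rows) throws IOException {
        out.write("<rows>".getBytes(StandardCharsets.UTF_8));
        for (int i = 0; i < rows; i++) {
            out.write(("<row>" + i + "</row>").getBytes(StandardCharsets.UTF_8));
        }
        out.write("</rows>".getBytes(StandardCharsets.UTF_8));
    }

    /** Pipes the producer's output to a consumer and returns the bytes read. */
    static long pipeAndCount(int rows) {
        // Only this 8 KB buffer sits between producer and consumer.
        try (PipedInputStream in = new PipedInputStream(8192)) {
            PipedOutputStream out = new PipedOutputStream(in);
            Thread producer = new Thread(() -> {
                // Closing the output end signals end-of-stream to the reader.
                try (OutputStream o = out) {
                    writeFo(o, rows);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            producer.start();

            long total = 0;
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n; // a real consumer would transform the data here
            }
            producer.join();
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Bytes piped: " + pipeAndCount(100_000));
    }
}
```

With this arrangement the full document never exists in memory at once, regardless of how many rows the producer writes.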

Option 1 is simpler. Option 2 is probably better here.
