
FOP - how to avoid high memory consumption with a very high number of page-sequences?

How can I keep FOP from consuming a growing amount of memory, even when pages contain no forward references and the <page-sequence> blocks are very small?

Here's a test Java program that feeds FOP a hand-made FO document which simply repeats the same very basic page-sequence over and over:

Fo2Pdf.java

import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;

import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;

import org.apache.fop.apps.FOUserAgent;
import org.apache.fop.apps.Fop;
import org.apache.fop.apps.FopFactory;
import org.apache.fop.apps.MimeConstants;
import org.xml.sax.helpers.DefaultHandler;

public class Fo2Pdf implements Runnable {

private PipedInputStream in;

public Fo2Pdf(PipedInputStream in)  {
    this.in = in;
}


@Override
public void run() {
    // instantiate Fop factory
    FopFactory fopFactory = FopFactory.newInstance();
    fopFactory.setStrictValidation(false);

    // Setup output
    OutputStream out = null;
    try {
        out = new FileOutputStream("output.pdf");
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }

    try {
        // Setup user agent
        FOUserAgent userAgent = fopFactory.newFOUserAgent();
        userAgent.setConserveMemoryPolicy(true);

        Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, userAgent, out);

        // Setup JAXP using identity transformer
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer = factory.newTransformer(); 

        // Setup input stream
        Source src = new StreamSource(in);

        // Resulting SAX events (the generated FO) must be piped through to FOP
        DefaultHandler defaultHandler = (DefaultHandler) fop.getDefaultHandler();
        Result res = new SAXResult(defaultHandler);

        // Start FOP processing
        transformer.transform(src, res);

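    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // Close the PDF file so any buffered output reaches disk
        if (out != null) {
            try {
                out.close();
            } catch (java.io.IOException e) {
                e.printStackTrace();
            }
        }
    }
    }
}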

FeedFo.java

import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;


public class FeedFo {


public static void main(String args[]) throws IOException, InterruptedException {

    // instantiate and connect the pipes
    PipedInputStream in = new PipedInputStream();
    PipedOutputStream out = new PipedOutputStream(in);

    // Fo2Pdf - instantiate and start consuming the stream
    Fo2Pdf fo2Pdf = new Fo2Pdf(in);
    Thread fo2PdfThread = new Thread(fo2Pdf, "Fo2Pdf");
    fo2PdfThread.start();

    /*
     <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
        <fo:layout-master-set>
            <fo:simple-page-master master-name="A4" page-width="210mm" page-height="297mm">
                <fo:region-body/>
            </fo:simple-page-master>
        </fo:layout-master-set>

     */
    out.write(("<fo:root xmlns:fo=\"http://www.w3.org/1999/XSL/Format\"><fo:layout-master-set>" +
            "<fo:simple-page-master master-name=\"A4\" page-width=\"210mm\" page-height=\"297mm\">" +
            "<fo:region-body/></fo:simple-page-master></fo:layout-master-set>").getBytes());


    for(int i=0; i<100000000; i++) {

        // sleep 3 seconds every 50000 page-sequences to make sure the consumer is faster than the producer
        if(i % 50000 == 0) {
            Thread.sleep(3000);
        }

        /*
         <fo:page-sequence xmlns:fo="http://www.w3.org/1999/XSL/Format" master-reference="A4">
            <fo:flow flow-name="xsl-region-body">
                <fo:block/>
            </fo:flow>
        </fo:page-sequence>
         */
        out.write(("<fo:page-sequence xmlns:fo=\"http://www.w3.org/1999/XSL/Format\" master-reference=\"A4\"><fo:flow flow-name=\"xsl-region-body\"><fo:block/></fo:flow></fo:page-sequence>").getBytes());
    }

    out.write("</fo:root>".getBytes());
    out.flush();
    out.close();

    fo2PdfThread.join();

    System.out.println("Exit");
}
}

As you can see, FOP writes the PDF to disk as soon as a page-sequence has been closed. This means that pages are not (or at least should not be) kept in memory. Yet memory just keeps growing. With a 256 MB heap, generation stops at about 150,000 page-sequences.

Why is this happening?

I suspect that, despite your sleep call, your producer is working much faster than your consumer and your piped stream is filling up your memory. I can think of two ways to fix this:

Option 1 is to use a BlockingQueue instead of a piped stream.
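A minimal sketch of Option 1, assuming a bounded ArrayBlockingQueue of byte arrays. The class name BoundedQueueDemo, the END marker, and the capacity are hypothetical names chosen for illustration, and the consumer only counts bytes as a stand-in for handing them to FOP:

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedQueueDemo {

    // Hypothetical end-of-stream marker; compared by identity, never by content
    private static final byte[] END = new byte[0];

    /** Pushes count chunks through a bounded queue and returns the bytes consumed. */
    public static long run(int count, int capacity) throws InterruptedException {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(capacity);
        byte[] chunk = "<fo:block/>".getBytes(StandardCharsets.UTF_8);
        long[] consumed = {0};

        Thread consumer = new Thread(() -> {
            try {
                // take() blocks while the queue is empty
                for (byte[] b = queue.take(); b != END; b = queue.take()) {
                    consumed[0] += b.length; // stand-in for feeding the bytes to FOP
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "consumer");
        consumer.start();

        for (int i = 0; i < count; i++) {
            queue.put(chunk); // blocks while the queue already holds `capacity` chunks
        }
        queue.put(END);
        consumer.join(); // join() makes the consumer's writes visible here
        return consumed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(100_000, 64));
    }
}
```

Because put() blocks once the queue is full, the producer can never run more than `capacity` chunks ahead of the consumer, which bounds the memory held between the two threads. To feed FOP this way, the consumer side would still need an InputStream adapter over the queue, which is not shown here.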

Option 2 is to add a public boolean pipeIsFull() method to Fo2Pdf that returns true if in.available() exceeds some threshold, say 2 MB. Your main for loop would then sleep for 500 ms or so whenever pipeIsFull() is true.
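A rough sketch of Option 2 follows. One caveat: a PipedInputStream has a fixed capacity (1024 bytes by default), and writes block once it is full, so in.available() can only exceed 2 MB if the pipe is constructed with a larger size. The class name PipeBackpressureDemo and the constants PIPE_SIZE and FULL_THRESHOLD are assumptions made for this illustration, not part of the original code:

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class PipeBackpressureDemo {

    // Assumed values for illustration, not taken from the original post
    static final int PIPE_SIZE = 4 * 1024 * 1024;      // pipe capacity: 4 MB
    static final int FULL_THRESHOLD = 2 * 1024 * 1024; // treat more than 2 MB as "full"

    private final PipedInputStream in;

    public PipeBackpressureDemo(PipedInputStream in) {
        this.in = in;
    }

    /** Returns true when more than FULL_THRESHOLD bytes are waiting to be read. */
    public boolean pipeIsFull() throws IOException {
        return in.available() > FULL_THRESHOLD;
    }

    public static void main(String[] args) throws Exception {
        // The pipe size must be larger than the threshold for the check to ever fire
        PipedInputStream in = new PipedInputStream(PIPE_SIZE);
        PipedOutputStream out = new PipedOutputStream(in);
        PipeBackpressureDemo consumer = new PipeBackpressureDemo(in);

        byte[] chunk = new byte[1024 * 1024]; // 1 MB of dummy data
        out.write(chunk);
        System.out.println(consumer.pipeIsFull()); // 1 MB buffered
        out.write(chunk);
        out.write(chunk);
        System.out.println(consumer.pipeIsFull()); // 3 MB buffered
        out.close();
    }
}
```

In FeedFo, the producer loop could then throttle itself with something like `while (fo2Pdf.pipeIsFull()) Thread.sleep(500);` before each write.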

Also, one way to reduce your memory consumption is to encode the page-sequence once, outside the loop:

byte[] bytes = ("<fo:page-sequence xmlns:fo=\"http://www.w3.org/1999/XSL/Format\" master-reference=\"A4\"><fo:flow flow-name=\"xsl-region-body\"><fo:block/></fo:flow></fo:page-sequence>").getBytes();
for(int i=0; i<100000000; i++) {
    ...
    out.write(bytes);
}

I don't know how significant an impact this will have. It avoids allocating a fresh byte array on every iteration, which adds up to gigabytes of short-lived garbage over the whole loop, but that's probably peanuts compared to what Fo2Pdf is using. Still, it can't hurt.
