Issue in Removing Header and Footer in PDF using iText PDF

Question

I am using the itext-xtra-5.5.6 API to remove (clean up) the header and footer.

Here is the code

//removes header and footer based on the configuration
public static void cleanUpContent(String inPDFFile,String targetPDFFile,PDFConfig pdfConfig) throws Exception{
    PdfReader reader = new PdfReader(inPDFFile);
    OutputStream outputStream = new FileOutputStream(targetPDFFile);
    float upperY=pdfConfig.getPdfUpperY();
    float lowerY=pdfConfig.getPdfLowerY();
    boolean highLightColor=pdfConfig.isPdfHighLightClippedTextColor();
    PdfStamper stamper = new PdfStamper(reader, outputStream);
    List<PdfCleanUpLocation> cleanUpLocations = new ArrayList<PdfCleanUpLocation>();

    for (int i = 1; i <= reader.getNumberOfPages(); i++) {
        Rectangle pageRect = reader.getCropBox(i);  
        Rectangle headerRect= new Rectangle(pageRect);
        headerRect.setBottom(headerRect.getTop()-upperY);               
        Rectangle footerRect= new Rectangle(pageRect);
        footerRect.setTop(footerRect.getBottom()+lowerY);   

        if(highLightColor){
            cleanUpLocations.add(new PdfCleanUpLocation(i, headerRect,BaseColor.GREEN));
            cleanUpLocations.add(new PdfCleanUpLocation(i, footerRect,BaseColor.GREEN));
        }else{
            cleanUpLocations.add(new PdfCleanUpLocation(i, headerRect));
            cleanUpLocations.add(new PdfCleanUpLocation(i, footerRect));
        }
    }   
    PdfCleanUpProcessor cleaner = new PdfCleanUpProcessor(cleanUpLocations, stamper);
    try{
        cleaner.cleanUp();
    }catch(Exception e){
         e.printStackTrace();
    }

    stamper.close();
    reader.close();
    outputStream.flush();
    outputStream.close();
}

When I run the code on a PDF file with 1440 pages using upperY=65 and lowerY=65, it deletes all the content from the page. With upperY=65 and lowerY=45, it deletes just the header and footer, which is the expected result.

Another issue is a NullPointerException thrown for some pages in the DefaultClipper class:

private void fixupFirstLefts2( OutRec OldOutRec, OutRec NewOutRec ) {
    for (final OutRec outRec : polyOuts) {
        if (outRec.firstLeft.equals( OldOutRec )) {
            outRec.firstLeft = NewOutRec;
        }
    }
}

For some entries in polyOuts, outRec.firstLeft is null, so the call to outRec.firstLeft.equals throws the NullPointerException.

Exception stack trace:

java.lang.NullPointerException
    at com.itextpdf.text.pdf.parser.clipper.DefaultClipper.fixupFirstLefts2(DefaultClipper.java:1463)
    at com.itextpdf.text.pdf.parser.clipper.DefaultClipper.joinCommonEdges(DefaultClipper.java:2121)
    at com.itextpdf.text.pdf.parser.clipper.DefaultClipper.executeInternal(DefaultClipper.java:1420)
    at com.itextpdf.text.pdf.parser.clipper.DefaultClipper.execute(DefaultClipper.java:1362)
    at com.itextpdf.text.pdf.pdfcleanup.PdfCleanUpRegionFilter.filterFillPath(PdfCleanUpRegionFilter.java:174)
    at com.itextpdf.text.pdf.pdfcleanup.PdfCleanUpRenderListener.filterCurrentPath(PdfCleanUpRenderListener.java:402)
    at com.itextpdf.text.pdf.pdfcleanup.PdfCleanUpRenderListener.renderPath(PdfCleanUpRenderListener.java:232)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.paintPath(PdfContentStreamProcessor.java:377)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.access$6300(PdfContentStreamProcessor.java:60)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor$PaintPath.invoke(PdfContentStreamProcessor.java:1183)
    at com.itextpdf.text.pdf.pdfcleanup.PdfCleanUpContentOperator.invoke(PdfCleanUpContentOperator.java:138)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.invokeOperator(PdfContentStreamProcessor.java:286)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.processContent(PdfContentStreamProcessor.java:429)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor$FormXObjectDoHandler.handleXObject(PdfContentStreamProcessor.java:1252)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.displayXObject(PdfContentStreamProcessor.java:352)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.access$6100(PdfContentStreamProcessor.java:60)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor$Do.invoke(PdfContentStreamProcessor.java:988)
    at com.itextpdf.text.pdf.pdfcleanup.PdfCleanUpContentOperator.invoke(PdfCleanUpContentOperator.java:138)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.invokeOperator(PdfContentStreamProcessor.java:286)
    at com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.processContent(PdfContentStreamProcessor.java:429)
    at com.itextpdf.text.pdf.pdfcleanup.PdfCleanUpProcessor.cleanUpPage(PdfCleanUpProcessor.java:160)
    at com.itextpdf.text.pdf.pdfcleanup.PdfCleanUpProcessor.cleanUp(PdfCleanUpProcessor.java:135)

I am not sure where I am making a mistake. I even checked whether the PDF pages contain images or other content types, but the pages are purely text based. Please help me resolve these two issues.

Answer 1

Concerning the OP's observations: an exception like the one presented is indeed thrown when running the OP's code with his sample file and iText and iText-xtra 5.5.6. Furthermore, the page on which this happens is empty in the result PDF.

The exception is indeed caused by a bug. The page ends up empty because the cleanup code, for each processed page, first removes the former content and then starts building the new content; if an exception occurs early while processing the page, as in the case at hand, the result can be an empty page.
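That the empty pages come from the exception and not from the rectangle sizes can be checked with the same arithmetic the OP's code performs. The following is only a sketch without iText: the class and method names are hypothetical, and the crop box running from y=0 to y=792 is an assumed page size, not a value taken from the OP's file.

```java
public class StripSketch {
    // Header strip as {bottom, top}: the top upperY units of the page,
    // mirroring headerRect.setBottom(headerRect.getTop() - upperY).
    static float[] headerStrip(float pageBottom, float pageTop, float upperY) {
        return new float[] { pageTop - upperY, pageTop };
    }

    // Footer strip as {bottom, top}: the bottom lowerY units of the page,
    // mirroring footerRect.setTop(footerRect.getBottom() + lowerY).
    static float[] footerStrip(float pageBottom, float pageTop, float lowerY) {
        return new float[] { pageBottom, pageBottom + lowerY };
    }

    // Height of the band between both strips that the cleanup leaves alone
    // (0 if the strips touch or overlap).
    static float untouchedHeight(float pageBottom, float pageTop,
                                 float upperY, float lowerY) {
        float headerBottom = headerStrip(pageBottom, pageTop, upperY)[0];
        float footerTop = footerStrip(pageBottom, pageTop, lowerY)[1];
        return Math.max(0f, headerBottom - footerTop);
    }

    public static void main(String[] args) {
        // Assumed crop box: y from 0 to 792 (hypothetical, not from the OP's PDF).
        System.out.println(untouchedHeight(0f, 792f, 65f, 65f)); // 662.0
        System.out.println(untouchedHeight(0f, 792f, 65f, 45f)); // 682.0
    }
}
```

Under that assumption, both configurations leave most of the page outside the cleanup rectangles, so upperY=65 with lowerY=65 alone would not remove all content.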

Meanwhile, though, the bug has been fixed: in a current 5.5.7 development snapshot the exception no longer occurs.
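For illustration, the failure mode from the question can be reproduced and avoided in isolation. This is a minimal sketch and not the actual iText patch: OutRec here is a simplified stand-in with only the one field that matters, and the identity comparison is merely one null-safe way to write the check.

```java
import java.util.ArrayList;
import java.util.List;

public class FirstLeftSketch {
    // Simplified stand-in for the clipper's OutRec (hypothetical, reduced to one field).
    static class OutRec {
        OutRec firstLeft; // null for some records, as observed in the question
    }

    // Re-points every record whose firstLeft is oldRec to newRec.
    // Comparing with == never dereferences firstLeft, so a null value is harmless,
    // unlike outRec.firstLeft.equals(oldRec).
    static void fixupFirstLefts(List<OutRec> polyOuts, OutRec oldRec, OutRec newRec) {
        for (OutRec outRec : polyOuts) {
            if (outRec.firstLeft == oldRec) {
                outRec.firstLeft = newRec;
            }
        }
    }

    public static void main(String[] args) {
        OutRec oldRec = new OutRec();
        OutRec newRec = new OutRec();
        OutRec child = new OutRec();
        child.firstLeft = oldRec;
        OutRec orphan = new OutRec(); // firstLeft stays null

        List<OutRec> polyOuts = new ArrayList<>();
        polyOuts.add(child);
        polyOuts.add(orphan);

        fixupFirstLefts(polyOuts, oldRec, newRec); // no NullPointerException
        System.out.println(child.firstLeft == newRec); // true
        System.out.println(orphan.firstLeft == null);  // true
    }
}
```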

A different unwanted effect occurs, though: the OP's sample document contains some rotated pages, e.g. page 18:

Applying the code as is to it, one gets:

The reason for this is that the PdfStamper usually tries to treat rotated portrait pages as if they were true landscape pages. As the PdfCleanUpProcessor is not rotation-aware, this results in mayhem.

One can tell the PdfStamper not to do so, though, using the setRotateContents setter:

    ...
    PdfStamper stamper = new PdfStamper(reader, outputStream);
    stamper.setRotateContents(false);
    List<PdfCleanUpLocation> cleanUpLocations = new ArrayList<PdfCleanUpLocation>();
    ...

This updated code now produces:

solution1
1 2015-09-10 13:04:13
