
How do I remove watermark Xobject from pdf?

I'd like to remove a watermark from a PDF file. It was probably created with Adobe Acrobat.

The book belongs to me. It is available to anyone who has access to the academic service called EBSCO. Many academic libraries subscribe to it, and so does mine. I downloaded the book and I want to print part of it without the annoying watermarks.

"ADBE_CompoundType": editable watermarks (headers, footers, stamps) created by Acrobat. (Information taken from here.)

I used the PdfContentStreamEditor class for pdfbox, created by mkl and published on SO as an answer to a question. I overrode one method. Here it is:

@Override
protected void write(final ContentStreamWriter contentStreamWriter,
    final Operator operator,
    final List<COSBase> operands) throws IOException {

    if (isWatermark(operator, operands)) {

        final COSName xObjectName = COSName.getPDFName("Fm0");
        final PDXObject fm0 = page.getResources().getXObject(xObjectName);
        if (fm0 != null) {
            final COSObject pieceInfo = fm0.getCOSObject()
                .getCOSObject(COSName.getPDFName("PieceInfo"));
            if (pieceInfo != null) {
                final COSBase adbeCompoundType = pieceInfo.getDictionaryObject(
                    COSName.getPDFName("ADBE_CompoundType"));
                if (adbeCompoundType != null) {
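                    final COSBase privateKey = ((COSDictionary) adbeCompoundType)
                        .getDictionaryObject(COSName.getPDFName("Private"));
                    // instanceof guards against a missing or non-name Private entry
                    if (privateKey instanceof COSName
                            && "Watermark".equals(((COSName) privateKey).getName())) {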
                        final PDResources resources = page.getResources();
                        resources.getCOSObject().removeItem(xObjectName);
                        page.getResources().getCOSObject().setNeedToBeUpdated(true);
                        return;
                    }
                }
            }
        }
    }
    super.write(contentStreamWriter, operator, operands);
}

And helper method:

private boolean isWatermark(final Operator operator,
    final List<COSBase> operands) {
    final String operatorString = operator.getName();
    return operatorString.equals("Do") &&
        operands.size() == 1 && ((COSName) operands.get(0)).getName().equals("Fm0");
}

The code seems to work fine: no watermark is shown on any page. However, I cannot get rid of the object that holds the watermark. I tried to remove it with the following lines of code, but the object is not removed.

final PDResources resources = page.getResources();
resources.getCOSObject().removeItem(xObjectName);
page.getResources().getCOSObject().setNeedToBeUpdated(true);

Here's a screenshot from pdfdebugger with watermark object:

[screenshot of the watermark object in pdfdebugger]

And here's the watermark text. I couldn't find out how to check whether a watermark object contains this text and I'd like to know how to do this.

[screenshot of the watermark text]

And here's one page of the pdf file: link1 and link2

You try to remove the XObject Fm0 from the resources like this:

final PDResources resources = page.getResources();
resources.getCOSObject().removeItem(xObjectName);

I.e., you fetch the COS dictionary object of the resources and try to remove the Fm0 entry (the name held in xObjectName) from it.

If you look closely at your screenshot, though, you'll see that the Fm0 entry is not directly in the Resources dictionary. Instead, the Resources dictionary contains a nested XObject dictionary, and the Fm0 entry is inside that one.

Thus, the following should work:

final PDResources resources = page.getResources();
COSDictionary dict = (COSDictionary) (resources.getCOSObject().getDictionaryObject(COSName.XOBJECT));
dict.removeItem(xObjectName);

PDResources has some helper methods, so the following should also work:

page.getResources().put(xObjectName, (PDXObject)null);
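To illustrate why removing the entry from the top-level Resources dictionary had no effect, here is a small library-free sketch. Plain `Map` objects stand in for the PDF dictionaries, and the class and method names (`NestedResourcesDemo`, `removeFromTopLevel`, `removeFromXObjects`) are made up for this illustration; none of this is pdfbox API.

```java
import java.util.HashMap;
import java.util.Map;

public class NestedResourcesDemo {

    /** Builds a stand-in for the page resources: Resources -> XObject -> Fm0. */
    public static Map<String, Object> buildResources() {
        Map<String, Object> xObjects = new HashMap<>();
        xObjects.put("Fm0", "watermark form stream (placeholder)");
        Map<String, Object> resources = new HashMap<>();
        resources.put("XObject", xObjects);
        return resources;
    }

    /** Mirrors the first attempt: look for the key in the top-level dictionary. */
    public static boolean removeFromTopLevel(Map<String, Object> resources, String name) {
        return resources.remove(name) != null;
    }

    /** Mirrors the fix: descend into the nested XObject dictionary first. */
    @SuppressWarnings("unchecked")
    public static boolean removeFromXObjects(Map<String, Object> resources, String name) {
        Object sub = resources.get("XObject");
        if (!(sub instanceof Map)) {
            return false; // no XObject sub-dictionary, nothing to remove
        }
        return ((Map<String, Object>) sub).remove(name) != null;
    }

    public static void main(String[] args) {
        Map<String, Object> resources = buildResources();
        System.out.println(removeFromTopLevel(resources, "Fm0")); // false: no such key at this level
        System.out.println(removeFromXObjects(resources, "Fm0")); // true: entry found and removed
    }
}
```

The top-level removal silently does nothing because the key simply is not there, which matches the behaviour described in the question.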

You mention that the book belongs to you and you, therefore, are entitled to remove the watermark. That is not automatically the case. Depending on the laws (global and local) and the contracts applicable you may only have acquired the right to use the book in its current form, including the watermark. Please make sure you understand the restrictions under which you may use the book.

Also I wonder why you want to get rid of that XObject if the watermark does not show anymore and you merely wanted to change the file to print without the watermark...

Although mkl has already answered this question, I'd like to share a solution using the iText library, even though I prefer pdfbox over iText because the former is provided free of charge. The iText code is less verbose than the pdfbox code: once the watermark object is removed from the resources, it is no longer shown on any page, so the content stream does not have to be edited.

for (int i = 1; i <= document.getNumberOfPages(); i++) {
    final PdfPage page = document.getPage(i);
    final PdfDictionary xObject = page.getResources().getResource(PdfName.XObject);
    if (xObject != null) {
        final PdfStream fm0 = xObject.getAsStream(new PdfName("Fm0"));
        if (fm0 != null) {
            final PdfDictionary pieceInfo = fm0.getAsDictionary(new PdfName("PieceInfo"));
            if (pieceInfo != null) {
                final PdfDictionary adbeCompoundType = pieceInfo.getAsDictionary(
                    new PdfName("ADBE_CompoundType"));
                if (adbeCompoundType != null) {
                    final PdfName privateKey = adbeCompoundType.getAsName(PdfName.Private);
                    if (privateKey != null) {
                        if ("Watermark".equals(privateKey.getValue())) {
                            xObject.remove(new PdfName("Fm0"));
                        }
                    }
                }
            }
        }
    }
}
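The chain of null checks above walks PieceInfo, then ADBE_CompoundType, then Private. As a library-free sketch of the same walk, the version below flattens it with one small helper. Plain `Map` objects stand in for PDF dictionaries, and `child` and `isAcrobatWatermark` are hypothetical names for this illustration, not iText or pdfbox API. It assumes Java 9 or later for `Map.of`.

```java
import java.util.Map;

public class WatermarkMarkerDemo {

    /** Returns the nested dictionary stored under key, or null if absent or not a dictionary. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> child(Map<String, Object> dict, String key) {
        if (dict == null) {
            return null;
        }
        Object value = dict.get(key);
        return (value instanceof Map) ? (Map<String, Object>) value : null;
    }

    /** True if form carries PieceInfo/ADBE_CompoundType with Private set to "Watermark". */
    public static boolean isAcrobatWatermark(Map<String, Object> form) {
        Map<String, Object> compound = child(child(form, "PieceInfo"), "ADBE_CompoundType");
        return compound != null && "Watermark".equals(compound.get("Private"));
    }

    public static void main(String[] args) {
        Map<String, Object> marked = Map.of("PieceInfo",
            Map.of("ADBE_CompoundType", Map.of("Private", "Watermark")));
        Map<String, Object> unmarked = Map.of("Subtype", "Form");
        System.out.println(isAcrobatWatermark(marked));   // true
        System.out.println(isAcrobatWatermark(unmarked)); // false
    }
}
```

Because `child` tolerates a null input, a missing level anywhere in the chain simply yields false instead of requiring its own `if` block.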
