How to replace text in a PDF file with the iText PDF library?

I have a requirement to replace a placeholder like ${placeholder} with an actual value, but I could not find a working solution. I've been following https://itextpdf.com/en/resources/examples/itext-7/replacing-pdf-objects and it doesn't work. Does anybody know how to do it?

In general, it's not easy to "replace" the content of a PDF file, because the same visible text can be written to the file in different ways. For example, suppose that you want to replace the chunk "Hello" with the chunk "World". You'd be lucky if "Hello" had been written to the PDF as a whole word. It might have been written as "He" and "llo", or even as "o", "l", "l", "e", "H", and the letters might be placed in different parts of the content stream.
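To make the problem concrete, here is a minimal sketch in plain Java (no iText involved). The two content stream fragments and the helper `joinLiterals` are made up for illustration. The helper assumes the pieces appear in reading order and contain no escape sequences, which real PDFs do not guarantee.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SplitTextDemo {
    // Matches simple PDF literal strings such as (He) or (llo).
    // Assumption: no escaped or nested parentheses inside the string.
    private static final Pattern LITERAL = Pattern.compile("\\(([^()\\\\]*)\\)");

    /** Concatenates the literal strings of a content stream fragment in stream order. */
    public static String joinLiterals(String contentStream) {
        StringBuilder sb = new StringBuilder();
        Matcher m = LITERAL.matcher(contentStream);
        while (m.find()) {
            sb.append(m.group(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Hello" written as one string with the Tj operator.
        String whole = "BT /F1 12 Tf (Hello) Tj ET";
        // The same visible text written as two pieces with the TJ operator.
        String split = "BT /F1 12 Tf [(He) -20 (llo)] TJ ET";

        System.out.println(whole.contains("Hello")); // true
        System.out.println(split.contains("Hello")); // false
        System.out.println(joinLiterals(split));     // Hello
    }
}
```

A plain string search finds the word only in the first fragment. Joining the pieces helps in this toy case, but it breaks down as soon as the letters are emitted out of order or the font uses a custom encoding.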

However, one can remove the content and then place other content in the same place.

Let's look at how it could be done.

1) I advise you to use iText's pdfSweep, since this tool can detect the areas where the content has been placed and remove that content (it's important to mention that pdfSweep doesn't just hide content, it removes it completely).

Please look at the following sample: https://github.com/itext/i7j-pdfsweep/blob/develop/src/test/java/com/itextpdf/pdfcleanup/BigDocumentAutoCleanUpTest.java

Let's discuss the redactTonySoprano test. As you can see, one can provide some regexes (for example, "Tony( |_)Soprano", "Soprano" and "Sopranoes") and iText will redact all matching content.
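To see what such regexes select, they can be tried on ordinary strings first. The sketch below uses only `java.util.regex`, not pdfSweep. The sample text and the `${...}` pattern for the placeholder from the question are my own assumptions, and whether pdfSweep finds a given placeholder still depends on how the text is stored in the PDF.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexMatchDemo {
    /** Returns every substring of text that matches regex, in order of appearance. */
    public static List<String> findMatches(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "Tony Soprano and Tony_Soprano met the Sopranoes.";
        // The group ( |_) accepts either a space or an underscore between the names.
        System.out.println(findMatches("Tony( |_)Soprano", text)); // [Tony Soprano, Tony_Soprano]
        System.out.println(findMatches("Sopranoes", text));        // [Sopranoes]

        // Hypothetical pattern for placeholders like ${name}; $, { and } must be escaped.
        String template = "Dear ${name}, your order ${orderId} has shipped.";
        System.out.println(findMatches("\\$\\{\\w+\\}", template)); // [${name}, ${orderId}]
    }
}
```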

Then you can write some text over these areas using iText, either via the low-level API (PdfCanvas) or via the higher-level layout API (Canvas, etc.).

Let's slightly modify the Soprano sample mentioned above:

2) Let's add some text over the redacted areas:

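    // "strategy" and "pdf" are the cleanup strategy and the PdfDocument from the sample above.
    for (IPdfTextLocation location : strategy.getResultantLocations()) {
        PdfPage page = pdf.getPage(location.getPageNumber() + 1);
        // Draw in a new content stream placed after the page's existing content.
        PdfCanvas pdfCanvas = new PdfCanvas(page.newContentStreamAfter(), page.getResources(), page.getDocument());
        // Confine the layout to the rectangle that was redacted.
        Canvas canvas = new Canvas(pdfCanvas, pdf, location.getRectangle());
        canvas.add(new Paragraph("SECURED").setFontSize(8));
    }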

The result is not ideal, but it is just a proof of concept. It's possible to override the extraction strategies and capture the font of the redacted content, so that the same font can be used for the new text placed on the redacted area.

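The sample code below replaces content in a PDF by editing the page's content stream directly. Note that it uses the older iText 5 API (PdfReader.getPageN, PRStream, PdfStamper), and that it only works when the text to replace appears in the stream exactly as searched for.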

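    // Uses the iText 5 API (com.itextpdf.text.pdf.*); also needs java.io.* and
    // java.nio.charset.StandardCharsets.
    File dir = new File("./");
    File[] files = dir.listFiles(new FilenameFilter() {
        @Override
        public boolean accept(File dir, String name) {
            return name.endsWith(".pdf");
        }
    });

    for (File pdffile : files) {
        System.out.println(pdffile.getName());
        PdfReader reader = new PdfReader(pdffile.toString());

        // Only page 1 is processed, and only if its content is a single stream.
        PdfDictionary dict = reader.getPageN(1);
        PdfObject object = dict.getDirectObject(PdfName.CONTENTS);
        if (object instanceof PRStream) {
            PRStream stream = (PRStream) object;
            byte[] data = PdfReader.getStreamBytes(stream);
            // ISO-8859-1 maps every byte to one char, so the round trip is lossless.
            String dd = new String(data, StandardCharsets.ISO_8859_1);
            // Fill the empty text string that follows the black fill color operator.
            dd = dd.replace("0 0 0 rg\n()Tj", "0 0 0 rg\n(Plan Advanced Payment)Tj");
            System.out.print(dd);
            stream.setData(dd.getBytes(StandardCharsets.ISO_8859_1));
        }
        // Write the modified document to ./output/<same file name>.
        PdfStamper stamper = new PdfStamper(reader,
                new FileOutputStream("./output/" + pdffile.getName()));
        stamper.close();
        reader.close();
    }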
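One detail the string replacement in the snippet above leaves open: inside a PDF literal string, a backslash or an unbalanced parenthesis must be escaped with a backslash, so inserting an arbitrary value between "(" and ")" can corrupt the stream. The following is a minimal sketch in plain Java (no iText). The names `escapeLiteral` and `fillEmptyText` are hypothetical, and it assumes the value contains only Latin-1 characters and that the font uses a simple encoding.

```java
import java.nio.charset.StandardCharsets;

class ContentStreamReplaceDemo {
    /** Escapes backslashes and parentheses so the text is safe inside a PDF literal string. */
    public static String escapeLiteral(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == '\\' || c == '(' || c == ')') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Fills the empty text string after "0 0 0 rg" with the escaped value. */
    public static byte[] fillEmptyText(byte[] data, String value) {
        // ISO-8859-1 keeps every byte intact when converting to a String and back.
        String stream = new String(data, StandardCharsets.ISO_8859_1);
        String replaced = stream.replace(
                "0 0 0 rg\n()Tj",
                "0 0 0 rg\n(" + escapeLiteral(value) + ")Tj");
        return replaced.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] in = "BT\n0 0 0 rg\n()Tj\nET".getBytes(StandardCharsets.ISO_8859_1);
        byte[] out = fillEmptyText(in, "Plan (Advanced) Payment");
        // The parentheses inside the value are written as \( and \).
        System.out.println(new String(out, StandardCharsets.ISO_8859_1));
    }
}
```

In the iText 5 loop above, the result of `fillEmptyText(data, value)` could be passed to `stream.setData(...)` in place of the hard-coded replacement.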
