Using iText to convert a PDF with forms into a text-only PDF (preserving the data)

Question

I have multiple PDFs (a.pdf, b.pdf, c[0-9].pdf, d[0-9].pdf, ez.pdf) that are populated with multiple records using AcroForms and PDFBox.
The resulting files (aflat.pdf, bflat.pdf, c[0-9]flat.pdf, d[0-9]flat.pdf, ezflat.pdf) should have their forms (the dictionaries and whatever else Adobe uses) removed, with the field values kept as plain text in the PDF (setReadOnly is not what I want!).

PdfStamper can only remove fields without saving their content, but I've found some references to PdfContentByte as a way to save the content. Alas, the documentation is too brief for me to understand how to do this.

As a last resort I could use FieldPosition to write directly on the PDF. Has anyone encountered this problem? How do I solve it?

UPDATE: Saving a single page of b.pdf yields a valid bfilled.pdf but a blank bflattened.pdf. Saving the whole document solved the issue.

    populateB();
    try (FileOutputStream stream = new FileOutputStream("bfilled.pdf")) {
        // Wrong approach: importing a single page into a new document corrupts the fields.
        // PDDocument doc = new PDDocument();
        // doc.importPage((PDPage) pdfDocuments.get(0).getDocumentCatalog().getAllPages().get(0));
        // doc.save(stream);

        // Right approach: save the whole populated document instead.
        pdfDocuments.get(0).save(stream);
    }
    try (FileOutputStream stream = new FileOutputStream("bflattened.pdf")) {
        PdfReader reader = new PdfReader("bfilled.pdf");
        PdfStamper stamper = new PdfStamper(reader, stream);
        // Draw the field values into the page content and remove the form fields.
        stamper.setFormFlattening(true);
        stamper.close();
        reader.close();
    }

Answer 1

Use PdfStamper.setFormFlattening(true) to get rid of the fields and write their values as page content.

Answer 2

Always save the whole document, not a single imported page, when working with AcroForms.

    populateB();
    try (FileOutputStream stream = new FileOutputStream("bfilled.pdf")) {
        // Wrong approach: importing a single page into a new document corrupts the fields.
        // PDDocument doc = new PDDocument();
        // doc.importPage((PDPage) pdfDocuments.get(0).getDocumentCatalog().getAllPages().get(0));
        // doc.save(stream);

        // Right approach: save the whole populated document instead.
        pdfDocuments.get(0).save(stream);
    }
    try (FileOutputStream stream = new FileOutputStream("bflattened.pdf")) {
        PdfReader reader = new PdfReader("bfilled.pdf");
        PdfStamper stamper = new PdfStamper(reader, stream);
        // Draw the field values into the page content and remove the form fields.
        stamper.setFormFlattening(true);
        stamper.close();
        reader.close();
    }
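
The same two steps apply to every file listed in the question (a.pdf, b.pdf, c[0-9].pdf, d[0-9].pdf, ez.pdf), so the output name can be derived from the input name rather than hard-coded. A minimal sketch, assuming the `flat` suffix convention from the question; `FlatNames` and `flatName` are hypothetical names used only for illustration and are not part of iText or PDFBox:

```java
// Hypothetical helper: FlatNames and flatName are illustrative names,
// not part of iText or PDFBox.
final class FlatNames {
    private FlatNames() {}

    /** Maps an input name such as "a.pdf" to "aflat.pdf". */
    static String flatName(String inputName) {
        String suffix = ".pdf";
        if (!inputName.endsWith(suffix)) {
            throw new IllegalArgumentException("Not a .pdf file name: " + inputName);
        }
        // Strip the extension, append "flat", then restore the extension.
        String base = inputName.substring(0, inputName.length() - suffix.length());
        return base + "flat" + suffix;
    }

    public static void main(String[] args) {
        // File names taken from the question.
        for (String name : new String[] {"a.pdf", "b.pdf", "c0.pdf", "d9.pdf", "ez.pdf"}) {
            System.out.println(name + " -> " + flatName(name));
        }
    }
}
```

The result of `flatName` would then be passed to the `FileOutputStream` that backs the `PdfStamper`, in place of the literal "bflattened.pdf".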

Answer 1
3 2015-02-24 11:34:35

Answer 2
1 2015-02-24 15:14:51
