How to extract attached files from a PDF with iText 7

How does one extract attached files from a PDF with iText 7?

The iText 5 sample code I found no longer works.

What I need is one byte[] per file, as in the iText 5 example below:

    PdfReader reader = new PdfReader(SRC);
    Map<String, byte[]> files = new HashMap<String,byte[]>();
    PdfObject obj;

    for (int i = 1; i <= reader.getXrefSize(); i++) {
        obj = reader.getPdfObject(i);
        if (obj != null && obj.isStream()) {
            PRStream stream = (PRStream)obj;
            byte[] b;
            try {
                b = PdfReader.getStreamBytes(stream);
            }
            catch(UnsupportedPdfException e) {
                b = PdfReader.getStreamBytesRaw(stream);
            }
            files.put(Integer.toString(i), b);
        }
    }

Thx /markus

You are searching for attachments by brute force, instead of querying the catalog for embedded files and the page dictionaries for attachment annotations.
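For illustration, the catalog-and-annotation lookup could be sketched roughly as follows. This is a minimal sketch, not tested against every kind of PDF: it assumes the iText 7 kernel module is on the classpath, and the class name `AttachmentExtractor` and its helper methods are made up for this example. It walks the /EmbeddedFiles name tree under the catalog's /Names dictionary, then scans each page's /Annots array for /FileAttachment annotations.

```java
import com.itextpdf.kernel.pdf.PdfArray;
import com.itextpdf.kernel.pdf.PdfDictionary;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfName;
import com.itextpdf.kernel.pdf.PdfStream;
import com.itextpdf.kernel.pdf.PdfString;

import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, named for this example only.
public class AttachmentExtractor {

    /** Collects document-level embedded files and page-level file attachment annotations. */
    public static Map<String, byte[]> extract(PdfDocument pdfDoc) {
        Map<String, byte[]> files = new LinkedHashMap<>();

        // Document level: catalog -> /Names -> /EmbeddedFiles (a name tree).
        PdfDictionary names = pdfDoc.getCatalog().getPdfObject().getAsDictionary(PdfName.Names);
        if (names != null) {
            walkNameTree(names.getAsDictionary(PdfName.EmbeddedFiles), files);
        }

        // Page level: annotations with /Subtype /FileAttachment carry a file spec in /FS.
        for (int p = 1; p <= pdfDoc.getNumberOfPages(); p++) {
            PdfArray annots = pdfDoc.getPage(p).getPdfObject().getAsArray(PdfName.Annots);
            if (annots == null) {
                continue;
            }
            for (int i = 0; i < annots.size(); i++) {
                PdfDictionary annot = annots.getAsDictionary(i);
                if (annot != null && PdfName.FileAttachment.equals(annot.getAsName(PdfName.Subtype))) {
                    addFileSpec(annot.getAsDictionary(PdfName.FS), "page" + p + "-annot" + i, files);
                }
            }
        }
        return files;
    }

    // A name tree node holds [key, value, key, value, ...] in /Names and child nodes in /Kids.
    private static void walkNameTree(PdfDictionary node, Map<String, byte[]> files) {
        if (node == null) {
            return;
        }
        PdfArray leaves = node.getAsArray(PdfName.Names);
        if (leaves != null) {
            for (int i = 0; i + 1 < leaves.size(); i += 2) {
                PdfString key = leaves.getAsString(i);
                String fallback = key != null ? key.toUnicodeString() : "entry" + i;
                addFileSpec(leaves.getAsDictionary(i + 1), fallback, files);
            }
        }
        PdfArray kids = node.getAsArray(PdfName.Kids);
        if (kids != null) {
            for (int i = 0; i < kids.size(); i++) {
                walkNameTree(kids.getAsDictionary(i), files);
            }
        }
    }

    // Reads the embedded stream from the file spec's /EF dictionary, preferring /UF over /F.
    private static void addFileSpec(PdfDictionary fileSpec, String fallbackName, Map<String, byte[]> files) {
        if (fileSpec == null) {
            return;
        }
        PdfDictionary ef = fileSpec.getAsDictionary(PdfName.EF);
        if (ef == null) {
            return; // file spec without embedded data, e.g. an external file reference
        }
        PdfStream stream = ef.getAsStream(PdfName.UF);
        if (stream == null) {
            stream = ef.getAsStream(PdfName.F);
        }
        if (stream == null) {
            return;
        }
        PdfString name = fileSpec.getAsString(PdfName.UF);
        if (name == null) {
            name = fileSpec.getAsString(PdfName.F);
        }
        files.put(name != null ? name.toUnicodeString() : fallbackName, stream.getBytes());
    }
}
```

Calling `AttachmentExtractor.extract(new PdfDocument(new PdfReader(SRC)))` would then give one byte[] per attachment, keyed by file name. Note that in this sketch two attachments with the same name overwrite each other in the map, and a malformed name tree with cyclic /Kids is not guarded against.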

Anyway, if I were to port your code to iText 7, it would look like this:

    PdfDocument pdfDoc = new PdfDocument(new PdfReader(SRC));
    PdfObject obj;
    for (int i = 1; i <= pdfDoc.getNumberOfPdfObjects(); i++) {
        obj = pdfDoc.getPdfObject(i);
        if (obj != null && obj.isStream()) {
            byte[] b;
            try {
                // Decoded stream content
                b = ((PdfStream) obj).getBytes();
            } catch (PdfException exc) {
                // Fall back to the raw (still encoded) bytes
                b = ((PdfStream) obj).getBytes(false);
            }
            // DEST is a format string with a placeholder for the object number
            try (FileOutputStream fos = new FileOutputStream(String.format(DEST, i))) {
                fos.write(b);
            }
        }
    }
    pdfDoc.close();

The only change I made is that I write each stream to a file.
