
Get original content of a PDF signed with iTextSharp

I'm trying to get the original document of a signed PDF in order to compare its hash with that of a stored document.

This is really easy when the document has several signatures: with Acrobat Reader you can go to the previous revision of the document, save it, and that's it.

Surprisingly, this does not work for the first signature, where there is no straightforward way to get the original data.

As it is not possible to do this with the Reader, I have tried to do it programmatically with iTextSharp. However, although I have searched extensively, I have not found out how. The only relevant post I found is this one, but it offers no solution.

Has anyone faced this problem and found a solution?

Thanks in advance.

EDIT: Here is the code that extracts the data, based on mkl's answer. Read the comments on that answer to be aware of the problem with the non-fixed length of the unsigned PDFs.

using System.IO;
using System.Text;

// Latin-1 maps every byte to exactly one char and back, so the binary PDF data survives the round trip.
Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
string signedText = File.ReadAllText("FileSigned.pdf", latin1);
// Offset of the previous cross-reference section, read from the last "]/Prev " in the file.
int prevStart = signedText.LastIndexOf("]/Prev ") + "]/Prev ".Length;
int prevEnd = signedText.IndexOf(">", prevStart);
string prevOffset = signedText.Substring(prevStart, prevEnd - prevStart);
// Find the startxref value of the previous revision (assumes CR LF line ends), then the %%EOF that closes it.
int startxrefPosition = signedText.IndexOf(prevOffset + "\r\n%%EOF");
int endPosition = signedText.IndexOf("%%EOF", startxrefPosition) + "%%EOF".Length;
// Everything up to and including that %%EOF is the previous revision.
File.WriteAllText("c:/OriginalFile.pdf", signedText.Substring(0, endPosition), latin1);

Whether your task of getting the original document of a signed PDF is feasible at all depends on how the signature was originally applied.

  1. If the signature was applied in append mode (i.e., in the language of the PDF specification ISO 32000-1:2008, as an incremental update, cf. section 7.5.6), you merely have to cut off this appended incremental update revision.

    As you have a stored document which, presumably, became the document you inspect after being signed, you can simply cut the signed file at the length of the stored one and then compare the two, e.g. using hashes. This suffices to show that the signed document is derived from your original one. There may have been other, intermediary revisions, though, as you might have cut off multiple incremental updates at once.

    In general, you can find the prior revision by following the /Prev trailer entry of your signed PDF to the cross-reference table of the prior revision and moving on from there to the end-of-file marker %%EOF, because in an incremental update

    the added trailer shall contain all the entries except the Prev entry (if present) from the previous trailer, whether modified or not. In addition, the added trailer dictionary shall contain a Prev entry giving the location of the previous cross-reference section (see Table 15). Each trailer shall be terminated by its own end-of-file (%%EOF) marker.

    In the case of PDFs using cross-reference streams instead of cross-reference tables, there is an analogous Prev entry in the cross-reference stream dictionary:

    (Present only if the file has more than one cross-reference stream; not meaningful in hybrid-reference files; see 7.5.8.4, "Compatibility with Applications That Do Not Support Compressed Reference Streams") The byte offset in the decoded stream from the beginning of the file to the beginning of the previous cross-reference stream. This entry has the same function as the Prev entry in the trailer dictionary (Table 15).

    You should be aware, though, that the appended incremental update revision can contain other changes in addition to the signature. Thus, even if the previous revision matches your stored document, you still only know that the signed document is based on your stored one.

  2. If the signature was not applied in append mode, you are out of luck: programs manipulating PDFs (e.g. for signing) might completely rearrange the binary contents of your document, possibly even renumbering objects, changing compression, removing unused objects, etc., while the appearance of the document remains the same.
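For the append-mode case in item 1, the two checks (locating the end of the previous revision via /Prev, and comparing the signed file's prefix with the stored document by hash) might be sketched in C# roughly as follows. This is only an illustration under stated assumptions: the helper names PreviousRevisionLength and StartsWithStoredDocument are made up for this example, the sample data is a toy byte layout rather than a valid PDF, and the textual search for the last /Prev entry assumes a non-linearized file whose newest trailer or cross-reference stream dictionary is the last place where /Prev occurs.

```csharp
// Top-level program (C# 9 or later); all names below are illustrative.
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

// Length of the file up to and including the %%EOF that ends the previous
// revision, or -1 if no usable /Prev entry is found.
static int PreviousRevisionLength(byte[] signedPdf)
{
    // Latin-1 maps each byte to one char, so string indexes equal byte offsets.
    string text = Encoding.GetEncoding("ISO-8859-1").GetString(signedPdf);
    // Assumption: the last "/Prev <offset>" in the file belongs to the newest trailer.
    MatchCollection matches = Regex.Matches(text, @"/Prev\s+(\d+)");
    if (matches.Count == 0) return -1;
    string digits = matches[matches.Count - 1].Groups[1].Value;
    if (!int.TryParse(digits, out int prevXref) || prevXref >= text.Length) return -1;
    // The previous revision ends at the first %%EOF after its cross-reference section.
    int eof = text.IndexOf("%%EOF", prevXref, StringComparison.Ordinal);
    return eof < 0 ? -1 : eof + "%%EOF".Length;
}

// True if the first storedPdf.Length bytes of signedPdf hash to the same value as storedPdf.
static bool StartsWithStoredDocument(byte[] signedPdf, byte[] storedPdf)
{
    if (storedPdf.Length > signedPdf.Length) return false;
    using SHA256 sha = SHA256.Create();
    byte[] prefixHash = sha.ComputeHash(signedPdf, 0, storedPdf.Length);
    byte[] storedHash = sha.ComputeHash(storedPdf);
    return prefixHash.SequenceEqual(storedHash);
}

// Toy two-revision byte layout (not a valid PDF), just enough to exercise the helpers.
string body = "%PDF-1.4\n1 0 obj\n<<>>\nendobj\n";
string original = body + "xref\n0 1\n0000000000 65535 f \ntrailer\n<</Size 2>>\nstartxref\n"
    + body.Length + "\n%%EOF\n";
string update = "2 0 obj\n<<>>\nendobj\nxref\n0 0\ntrailer\n<</Size 3/Prev "
    + body.Length + ">>\nstartxref\n0\n%%EOF\n";

Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
byte[] stored = latin1.GetBytes(original);
byte[] signed = latin1.GetBytes(original + update);

Console.WriteLine(PreviousRevisionLength(signed));            // end of the first revision
Console.WriteLine(StartsWithStoredDocument(signed, stored));  // prefix hash comparison
```

Note that the length returned by the /Prev walk stops right after %%EOF, whereas the stored file may end with an additional end-of-line sequence. Cutting at the stored file's own length, as the hash comparison does, avoids that ambiguity whenever the stored document is available.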
