
iText API for PDF comparison

Can I use the iText API to compare two PDF files? I have gone through various approaches on Stack Overflow for comparing PDF files, such as dedicated tools and utilities like ImageMagick. The PDFs I wish to compare are financial reports with graphs, tables, and text. We have to compare a large number of files and would like to do it through a command-line utility. There is a ComparePDF command-line tool, but it only reports whether two files contain differences. We would like to print a log of the differences between the files. Can we accomplish this with iText?

What do you want to compare? iText could be used to compare structure and syntax, but two different PDFs that look identical to the human eye may have a completely different structure and syntax internally.

At iText, we have written JUnit tests that use Ghostscript to create images of each page. These images are then compared to each other pixel by pixel.
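The pixel-by-pixel step could look roughly like the sketch below. It is not iText's actual test code: it assumes the pages have already been rendered to image files (for example by Ghostscript) and loaded as `BufferedImage` objects, for instance with `javax.imageio.ImageIO.read`. The class name `PageImageCompare` and the method `countDifferentPixels` are made up for illustration.

```java
import java.awt.image.BufferedImage;

public class PageImageCompare {

    /**
     * Counts the pixels whose ARGB values differ between two page images.
     * Returns -1 if the images do not have the same dimensions, since a
     * pixel-by-pixel comparison is not meaningful in that case.
     */
    public static long countDifferentPixels(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            return -1;
        }
        long different = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                // getRGB returns the pixel as a packed ARGB int
                if (a.getRGB(x, y) != b.getRGB(x, y)) {
                    different++;
                }
            }
        }
        return different;
    }
}
```

A result of 0 means the two rendered pages are visually identical at the chosen resolution. Any other value can be written to a log together with the file name and page number.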

We also use iText in JUnit tests, but these tests look at the structure and the syntax more than at the content.

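You need to use the Myers O(ND) diff algorithm for PDF comparison. Neither the iText nor the PDFBox API provides a PDF comparison method. You can use iText to extract the text of these files together with its coordinates, and then apply the Myers O(ND) diff algorithm to find the differences and highlight the changes.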
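As a rough illustration of the diff step only, here is a minimal sketch of the Myers O(ND) algorithm applied to two lists of strings. It assumes the text has already been extracted from each PDF (for example, one string per line) by a separate step that is not shown, and it ignores coordinates. The class name `MyersDiff` and the output format (`"  "` for unchanged, `"- "` for removed, `"+ "` for added) are choices made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MyersDiff {

    /** Returns an edit script turning a into b: "  x" kept, "- x" removed, "+ x" added. */
    public static List<String> diff(List<String> a, List<String> b) {
        int n = a.size(), m = b.size(), max = n + m;
        int offset = max + 1;              // shifts diagonal k into a valid array index
        int[] v = new int[2 * max + 3];    // v[offset + k] = furthest x reached on diagonal k
        List<int[]> trace = new ArrayList<>();
        int found = 0;
        search:
        for (int d = 0; d <= max; d++) {
            trace.add(v.clone());          // state before round d, needed for backtracking
            for (int k = -d; k <= d; k += 2) {
                int x;
                if (k == -d || (k != d && v[offset + k - 1] < v[offset + k + 1])) {
                    x = v[offset + k + 1];         // step down: insertion from b
                } else {
                    x = v[offset + k - 1] + 1;     // step right: deletion from a
                }
                int y = x - k;
                while (x < n && y < m && a.get(x).equals(b.get(y))) { x++; y++; }
                v[offset + k] = x;
                if (x >= n && y >= m) { found = d; break search; }
            }
        }
        // Walk back from (n, m) to (0, 0), collecting the edits in reverse order.
        List<String> script = new ArrayList<>();
        int x = n, y = m;
        for (int d = found; d >= 0; d--) {
            int[] prev = trace.get(d);
            int k = x - y;
            int prevK = (k == -d || (k != d && prev[offset + k - 1] < prev[offset + k + 1]))
                    ? k + 1 : k - 1;
            int prevX = prev[offset + prevK];
            int prevY = prevX - prevK;
            while (x > prevX && y > prevY) {       // diagonal moves: unchanged lines
                script.add("  " + a.get(x - 1));
                x--; y--;
            }
            if (d > 0) {
                if (x == prevX) script.add("+ " + b.get(y - 1));
                else script.add("- " + a.get(x - 1));
            }
            x = prevX; y = prevY;
        }
        Collections.reverse(script);
        return script;
    }
}
```

For example, `MyersDiff.diff(List.of("a", "b", "c"), List.of("a", "x", "c"))` returns the four entries `"  a"`, `"- b"`, `"+ x"`, `"  c"`, which can be printed directly as a log of the differences.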
