Read data from a PDF document that does not have an XFA-form

I use iText to read a PDF document containing an XFA form. I convert it to XML, read the data from the XML, and insert it into a database. But if the PDF does not contain an XFA form, how can I efficiently read data from it?

It depends on your expectations.

  • You can use text extraction to retrieve all the text on a given page. How you then process that text is up to you (for example, with regular expressions).

  • You can also use pdf2Data, an iText 7 add-on that lets you match documents against templates. pdf2Data seems like a good fit here, since it produces XML files as its output.

More information on pdf2Data can be found here: http://itextpdf.com/itext7/pdf2Data
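For the first option, here is a minimal sketch of the "process the text with regular expressions" step. It assumes the page text has already been extracted into a String (in iText 7 that would typically come from PdfTextExtractor.getTextFromPage) and that the data appears as "Label: value" lines. The class name FieldExtractor, the sample labels, and that line layout are hypothetical, so adapt the pattern to your own documents.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns extracted page text into label/value pairs.
public class FieldExtractor {

    // Assumed layout: one "Label: value" pair per line.
    // (?m) makes ^ and $ match at the start and end of each line.
    private static final Pattern FIELD =
            Pattern.compile("(?m)^[ \\t]*([^:\\r\\n]+?)[ \\t]*:[ \\t]*(.+?)[ \\t]*$");

    public static Map<String, String> extractFields(String pageText) {
        // LinkedHashMap keeps the fields in document order.
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(pageText);
        while (m.find()) {
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Stand-in for text obtained from the PDF, e.g. with iText 7:
        //   String pageText = PdfTextExtractor.getTextFromPage(pdfDoc.getPage(1));
        String pageText = "Invoice No: 12345\nCustomer: Jane Doe\nTotal: 99.50\n";

        extractFields(pageText).forEach((label, value) ->
                System.out.println(label + " = " + value));
    }
}
```

The resulting map can then be written to the database in the same way as the values read from the XFA XML. Note that plain text extraction depends on how the text is laid out on the page, so a pattern like this may need tuning per document type.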
