
How to extract the data from a pdf File using iText

I am working on a program that extracts data from a PDF file, using iText as the Java library. When I try to open the file with this method:

public static void main(String[] args) {
    try {
        // TODO code application logic here

        PdfReader pr=new FdfReader("C:\\Users\\saviour\\Desktop\\doc308-999.pdf");

        String str=PdfTextExtractor.getTextFromPage(pr, 2); 
        System.out.println(str);

    } catch (IOException ex) {
        Logger.getLogger(PDFTests.class.getName()).log(Level.SEVERE, null, ex);
    }

}

I get this error:

com.itextpdf.text.exceptions.InvalidPdfException: FDF header signature not found.
    at com.itextpdf.text.pdf.PRTokeniser.checkFdfHeader(PRTokeniser.java:215)
    at com.itextpdf.text.pdf.FdfReader.readPdf(FdfReader.java:95)
    at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:169)
    at com.itextpdf.text.pdf.PdfReader.<init>(PdfReader.java:158)
    at com.itextpdf.text.pdf.FdfReader.<init>(FdfReader.java:63)
    at pdftests.PDFTests.main(PDFTests.java:39)

So what does this exception mean? Thank you.

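The exception means that the file you are loading does not start with the FDF header signature. FdfReader reads FDF (Forms Data Format) files, not regular PDF documents, so it fails this header check when it is given an ordinary PDF or a file that is not a PDF at all. See the javadoc for InvalidPdfException.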

Try the following code change, which should give the expected result.
Change PdfReader pr=new FdfReader("C:\\Users\\saviour\\Desktop\\doc308-999.pdf");
to PdfReader pr=new PdfReader("C:\\Users\\saviour\\Desktop\\doc308-999.pdf");
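To see which kind of file you actually have, here is a minimal sketch that looks for the %PDF- or %FDF- marker near the start of the file. HeaderCheck, kindOf and kindOfFile are hypothetical names, not part of iText, and the 1024-byte search window is an assumption of this sketch. It uses only the standard library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of iText: reports which header marker a file carries.
final class HeaderCheck {

    // Assumption of this sketch: only the first 1024 bytes are searched.
    private static final int WINDOW = 1024;

    // Classifies the leading bytes of a file as "PDF", "FDF" or "UNKNOWN".
    static String kindOf(byte[] head) {
        int length = Math.min(head.length, WINDOW);
        // ISO-8859-1 maps every byte to one char, so binary content cannot break decoding.
        String text = new String(head, 0, length, StandardCharsets.ISO_8859_1);
        if (text.contains("%PDF-")) {
            return "PDF";
        }
        if (text.contains("%FDF-")) {
            return "FDF";
        }
        return "UNKNOWN";
    }

    // Reads up to WINDOW bytes from the file and classifies them.
    static String kindOfFile(Path path) throws IOException {
        byte[] buffer = new byte[WINDOW];
        int total = 0;
        try (InputStream in = Files.newInputStream(path)) {
            while (total < buffer.length) {
                int read = in.read(buffer, total, buffer.length - total);
                if (read < 0) {
                    break; // end of file reached before the window was full
                }
                total += read;
            }
        }
        return kindOf(Arrays.copyOf(buffer, total));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HeaderCheck <file>");
            return;
        }
        System.out.println(kindOfFile(Paths.get(args[0])));
    }
}
```

If this reports PDF for your file, PdfReader is the class to use; FdfReader is only appropriate when it reports FDF.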


Try changing the file location. Sometimes the OS does not allow other applications to read files from certain locations on the system drive. Put the file somewhere else, for example on D:.
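Before moving the file, you can check whether Java can see and read it at all. This is a minimal sketch using only java.nio.file; FileAccessCheck and describe are hypothetical names and not part of iText.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of iText: explains why a path may fail to open.
final class FileAccessCheck {

    // Returns a short description of the first access problem found, or "readable".
    static String describe(Path path) {
        if (!Files.exists(path)) {
            return "missing";
        }
        if (!Files.isRegularFile(path)) {
            return "not a regular file";
        }
        if (!Files.isReadable(path)) {
            return "not readable";
        }
        return "readable";
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java FileAccessCheck <file>");
            return;
        }
        System.out.println(describe(Paths.get(args[0])));
    }
}
```

If this reports readable for the path from the question, the file location is not the cause of the exception.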

Also make sure the PDF has enough pages (at least 2, since you are reading the 2nd page), or try PdfTextExtractor.getTextFromPage(pr, 1) to get content from another page.

