
How can I check in Java whether a file is corrupted (unreadable) or not?

I have a web application where a user can upload any PDF via FTP. After the PDF file is uploaded, I perform certain operations on it.

The problem is that the connection sometimes breaks during the FTP upload, so the uploaded PDF is incomplete (it behaves like a corrupted file). When I try to open that document in Acrobat Reader, it shows the message 'There was an error opening the document. The file is damaged and could not be repaired'.

Before I start processing the PDF, I want to check whether the uploaded file is readable, that is, not corrupted.

Does Java provide any API for that, or is there another way to check whether a file is corrupted?

The iText API lets you work with PDF files in Java.

To check whether a PDF file can be loaded and read, use com.itextpdf.text.pdf.PdfReader.
If the file is corrupted, an exception such as com.itextpdf.text.exceptions.InvalidPdfException is thrown.

Sample code snippet:

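...
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;
...
PdfReader pdfReader = null;
try {
    // The constructor parses the file and throws if the PDF structure is broken.
    pdfReader = new PdfReader( pathToUploadedPdfFile );

    // Optionally read something from the document to confirm it is usable.
    String textFromPdfFilePageOne = PdfTextExtractor.getTextFromPage( pdfReader, 1 );
    System.out.println( textFromPdfFilePageOne );
}
catch ( Exception e ) {
    // The file is corrupted or unreadable: reject it and skip further processing.
}
finally {
    if ( pdfReader != null ) {
        pdfReader.close();
    }
}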

For a file that was uploaded but is corrupted, you may see the following error:

com.itextpdf.text.exceptions.InvalidPdfException: Rebuild failed:
  trailer not found.; Original message: PDF startxref not found.

Note: To reproduce such an exception, start downloading a PDF file from the web and abort the download midway.
Then load the partial file with the code snippet above and check whether it loads without an exception.
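
The error message above points at the missing trailer: a PDF normally begins with a "%PDF-" header and ends with a "startxref" entry followed by "%%EOF", and an interrupted upload usually cuts off that ending. As a rough sketch (not part of iText), the check below uses only the standard library to look for those markers before handing the file to PdfReader. The class name PdfQuickCheck, the method looksComplete, and the 1024-byte tail window are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class PdfQuickCheck {

    // How many trailing bytes to scan for the end markers (an assumed value).
    private static final int TAIL_SIZE = 1024;

    /**
     * Returns true if the file starts with "%PDF-" and its last bytes contain
     * "startxref" followed by "%%EOF". This only detects obvious truncation;
     * it does not prove that the PDF is valid.
     */
    public static boolean looksComplete(String path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long length = file.length();
            if (length < 5) {
                return false;
            }

            // Check the header at the very start of the file.
            byte[] head = new byte[5];
            file.readFully(head);
            if (!"%PDF-".equals(new String(head, StandardCharsets.ISO_8859_1))) {
                return false;
            }

            // Read the tail of the file and look for the trailer markers.
            int tailLength = (int) Math.min(TAIL_SIZE, length);
            byte[] tail = new byte[tailLength];
            file.seek(length - tailLength);
            file.readFully(tail);
            String tailText = new String(tail, StandardCharsets.ISO_8859_1);

            int startxref = tailText.lastIndexOf("startxref");
            return startxref >= 0 && tailText.indexOf("%%EOF", startxref) > startxref;
        }
    }
}
```

Treat this as a cheap first filter only. A file that passes can still be damaged in the middle, and some readers tolerate a few stray bytes before the header, which this sketch would reject, so the PdfReader check remains the deciding test.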

You can find detailed examples of the iText API at

Use Case Examples of iText PDF | iText.
