
PDFBox 1.8.10: error creating a PDDocument with the load() method

I am using PDFBox 1.8.10.

If I read the PDF file from disk into a byte array and load it from there, it works:

File file = new File(args[0]);   // an ordinary PDF file on disk
ByteArrayOutputStream bos = new ByteArrayOutputStream();
byte[] buf = new byte[1024];
// try-with-resources closes the stream even if reading fails
try (FileInputStream fis = new FileInputStream(file)) {
    for (int readNum; (readNum = fis.read(buf)) != -1;) {
        bos.write(buf, 0, readNum); // copy only the bytes actually read
    }
} catch (IOException ex) {
    ex.printStackTrace();
}
byte[] bytes = bos.toByteArray();
CheckIsPDF(bytes); // my own helper method
pdf = PDDocument.load(new ByteArrayInputStream(bytes)); // no exception here

But if the same file is stored in a database and I try to load it in the same way, I get the following exception: "java.io.IOException: Error: End-of-File, expected line".

This is the code that reads the data from the database and loads the PDF:

List<byte[]> forms; // populated from the database; the data stored in the DB is hex
for (byte[] file : forms) {
    try {
        int var = file.length;
        pdDocument = PDDocument.load(new ByteArrayInputStream(file)); // exception is thrown here
        fieldLists = PDFFormUtils.printFields(pdDocument);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

As discussed in the comments, the cause of the problem was that the content of the blob wasn't a PDF. The blob content is:

43 3a 5c 4d 42 43 50 4f 53 5c 52 65 6e 74 2e 70 64 66

A PDF file starts with "%PDF", so in hex the first bytes would be

25 50 44 46
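A check for these four bytes can be run on the byte array before it is handed to PDFBox. The following is only a minimal sketch: the class name PdfHeaderCheck and the method startsWithPdfHeader are made up for illustration and are not part of PDFBox, and the test is deliberately strict (it requires the marker at offset 0, while some PDF readers tolerate a few leading junk bytes).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox.
final class PdfHeaderCheck {
    // "%PDF" in ASCII is 0x25 0x50 0x44 0x46
    private static final byte[] PDF_MAGIC = "%PDF".getBytes(StandardCharsets.US_ASCII);

    // Returns true only if the data begins with the bytes of "%PDF".
    static boolean startsWithPdfHeader(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] pdfStart = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        byte[] blob = "C:\\MBCPOS\\Rent.pdf".getBytes(StandardCharsets.US_ASCII);
        System.out.println(startsWithPdfHeader(pdfStart)); // true
        System.out.println(startsWithPdfHeader(blob));     // false
    }
}
```

Calling such a check right before PDDocument.load() would make it possible to log or skip a bad blob instead of running into the less obvious End-of-File parsing error.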

The hex sequence you mention translates to

C:\MBCPOS\Rent.pdf

which means that somebody saved the file name instead of the file contents into the blob.
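To reproduce this kind of diagnosis, it can help to print the first bytes of a suspicious blob as text. The sketch below assumes the blob is available either as a byte[] or as a space-separated hex dump like the one above; the names BlobPeek, parseHex and preview are hypothetical helpers introduced only for this example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper for inspecting blob contents.
final class BlobPeek {
    // Converts a space-separated hex dump such as "25 50 44 46" into bytes.
    static byte[] parseHex(String spacedHex) {
        String trimmed = spacedHex.trim();
        if (trimmed.isEmpty()) {
            return new byte[0];
        }
        String[] parts = trimmed.split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    // Decodes at most maxBytes bytes as ISO-8859-1 so every byte maps to one character.
    static String preview(byte[] data, int maxBytes) {
        int n = Math.min(data.length, maxBytes);
        return new String(data, 0, n, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] blob = parseHex("43 3a 5c 4d 42 43 50 4f 53 5c 52 65 6e 74 2e 70 64 66");
        System.out.println(preview(blob, 64)); // C:\MBCPOS\Rent.pdf
    }
}
```

In the loop from the question, printing preview(file, 64) for each element of forms would show right away whether the database returned PDF data or something else, such as a path.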
