
Validate that an uploaded file is a PDF

How can I validate that an uploaded file is really a PDF, not only by its extension (.pdf) but also by its content? If someone changes the extension of some other file to .pdf, the upload should fail.

You can use Apache Tika for this, available here: http://tika.apache.org/

You can also find a practical example here: https://dzone.com/articles/determining-file-types-java
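To give a rough idea of what content-based detection looks at, here is a minimal, dependency-free sketch that checks for the `%PDF-` signature a PDF file starts with. It is not Tika's implementation, only an illustration of the magic-byte idea. The class and method names (`PdfHeaderCheck`, `hasPdfHeader`) are made up for this example, and the stream variant assumes Java 11+ for `readNBytes`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, named for illustration only.
class PdfHeaderCheck {

    // A PDF file is expected to start with the ASCII signature "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the given bytes begin with the PDF signature. */
    static boolean hasPdfHeader(byte[] head) {
        if (head == null || head.length < PDF_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(head, PDF_MAGIC.length), PDF_MAGIC);
    }

    /** Reads only the first few bytes of the stream and checks them. */
    static boolean hasPdfHeader(InputStream in) throws IOException {
        // Note: this consumes the bytes it reads from the stream (Java 11+).
        byte[] head = in.readNBytes(PDF_MAGIC.length);
        return hasPdfHeader(head);
    }
}
```

A renamed image or text file fails this check, but passing it does not prove the file is intact or parseable. For that you still need a real PDF parser or a detection library.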

There are many ways to validate a PDF file. I used iText to check whether a PDF is corrupted or not.

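import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;
import java.io.IOException;

// iText 5 API; "file" is assumed to be the path of the uploaded file.
boolean pdfFileValidator(String file) {
    PdfReader pdfReader = null;
    try {
        // Parsing fails with InvalidPdfException if the content is not a PDF.
        pdfReader = new PdfReader(file);
        // Extracting text from page 1 forces the page content to be read.
        PdfTextExtractor.getTextFromPage(pdfReader, 1);
        LOGGER.info("pdfFileValidator ==> Exit");
        return true;
    } catch (IOException e) {
        // InvalidPdfException is a subclass of IOException.
        LOGGER.error("pdfFileValidator ==> Exit. Error ==> " + e.getMessage(), e);
        return false;
    } finally {
        if (pdfReader != null) {
            pdfReader.close();
        }
    }
}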

If the file is not a PDF or is corrupted, it will throw an InvalidPdfException. The above example requires the iText library.

Several libraries can check whether a file is a valid PDF, for instance veraPDF or PDFBox. Any other library that does the job will work too. As already mentioned, Tika is another option: it inspects a file's content and metadata to tell you what type the file is.

As a bare-bones example, you can do the following with PDFBox. Keep in mind that this validates whether the file is PDF/A compliant, so an ordinary PDF that is valid but not PDF/A will also be rejected.

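import java.io.File;
import org.apache.pdfbox.preflight.PreflightDocument;
import org.apache.pdfbox.preflight.ValidationResult;
import org.apache.pdfbox.preflight.parser.PreflightParser;

// PDFBox 2.x Preflight API
boolean validateImpl(File file) {
    try {
        PreflightParser parser = new PreflightParser(file);
        // The file must be parsed before the preflight document is available.
        parser.parse();
        try (PreflightDocument document = parser.getPreflightDocument()) {
            document.validate();
            ValidationResult validationResult = document.getResult();
            return validationResult.isValid();
        }
    } catch (Exception e) {
        // Not parseable as a PDF, or an error occurred while validating
        return false;
    }
}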

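Or, with Tika, you can do something like the following and compare the result against "application/pdf":

import java.io.File;
import java.io.IOException;
import org.apache.tika.Tika;

public String tikaDetect(File file) throws IOException {
    Tika tika = new Tika();
    // Returns the detected media type, e.g. "application/pdf"
    return tika.detect(file);
}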

