
Check if a PDF file is corrupted using C#

We have an application that generates PDF files. Sometimes, for an unknown reason, one of the files is created corrupted. We need to check whether each PDF is corrupted before moving on to the next one; if it is, we need to create it again.

Thanks

Look at PDF parsers and try to use them to detect the corruption, for example Ghostscript.
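As a rough sketch of the Ghostscript approach, you could run it as a child process, render every page to the `nullpage` device (which discards the output) and treat a non-zero exit code as "probably corrupt". The class name, the method names and the default executable name `gs` are assumptions made for illustration (on Windows the console executable is usually `gswin64c`). Note that Ghostscript tends to repair damaged files silently, so this sketch passes `-dPDFSTOPONERROR` to make it stop instead; check how your Ghostscript version behaves before relying on it.

```csharp
using System;
using System.Diagnostics;

public static class GhostscriptPdfCheck
{
    // Builds the command line: quiet batch mode, stop on the first PDF error,
    // and render to the "nullpage" device so no output file is written.
    public static string BuildArguments(string pdfPath)
    {
        return "-q -dNOPAUSE -dBATCH -dPDFSTOPONERROR -sDEVICE=nullpage \"" + pdfPath + "\"";
    }

    // Returns true when Ghostscript processes the whole file and exits with code 0.
    // gsExecutable is an assumption: "gs" on Linux/macOS, "gswin64c" on Windows.
    public static bool LooksReadable(string pdfPath, string gsExecutable = "gs")
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = gsExecutable,
            Arguments = BuildArguments(pdfPath),
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            // Drain stdout in the background so a full pipe cannot block Ghostscript
            // while we are reading stderr.
            var stdoutTask = process.StandardOutput.ReadToEndAsync();
            process.StandardError.ReadToEnd();
            stdoutTask.Wait();
            process.WaitForExit();
            return process.ExitCode == 0;
        }
    }
}
```

You would call `GhostscriptPdfCheck.LooksReadable(path)` right after generating each file and regenerate the file when it returns false.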

Disclaimer: I work for Atalasoft.

In DotImage Document Imaging, we include some PDF parsing classes that will throw if the file is corrupt.

If you add our PDF Reader add-on, we will try to rasterize the PDF; if it's corrupt, that will throw. If the problem is missing pieces, then you can look for them in the resulting image.

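You can check the PDF header like this. Note that this only confirms the file starts with the PDF signature; it does not detect damage further into the file: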

using System.IO;
using System.Text;

public bool IsPDFHeader(string fileName)
{
    // Read only the first five bytes; a PDF file starts with "%PDF-" (e.g. "%PDF-1.0").
    byte[] buffer;
    using (var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
    using (var br = new BinaryReader(fs))
    {
        buffer = br.ReadBytes(5);
    }

    // ReadBytes returns fewer bytes when the file is shorter than requested.
    if (buffer.Length < 5)
    {
        return false;
    }

    // "%PDF-" in ASCII is 0x25 0x50 0x44 0x46 0x2D.
    var header = Encoding.ASCII.GetString(buffer);
    return header == "%PDF-";
}
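Since a file that was cut off while being written still passes the header check, a slightly stronger quick test also looks for the `%%EOF` marker near the end of the file. The following is only a minimal sketch: the class and method names are made up for illustration, and the 1024-byte window is an assumption based on how PDF readers conventionally search for the end-of-file marker. It still says nothing about damage in the middle of the file, so a real parser remains the more reliable check.

```csharp
using System;
using System.IO;
using System.Text;

public static class PdfQuickCheck
{
    // Returns true when the file starts with "%PDF-" and "%%EOF" appears
    // somewhere in the last 1024 bytes (or in the whole file if it is shorter).
    public static bool HasHeaderAndTrailer(string fileName)
    {
        using (var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
        {
            var head = new byte[5];
            if (ReadFully(fs, head) < head.Length ||
                Encoding.ASCII.GetString(head) != "%PDF-")
            {
                return false;
            }

            int tailLength = (int)Math.Min(1024, fs.Length);
            var tail = new byte[tailLength];
            fs.Seek(-tailLength, SeekOrigin.End);
            if (ReadFully(fs, tail) < tailLength)
            {
                return false;
            }

            // Non-ASCII bytes decode to '?', which cannot produce a false match.
            return Encoding.ASCII.GetString(tail).Contains("%%EOF");
        }
    }

    // Stream.Read may return fewer bytes than requested, so keep reading
    // until the buffer is full or the end of the stream is reached.
    private static int ReadFully(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0)
            {
                break;
            }
            total += read;
        }
        return total;
    }
}
```

A call such as `PdfQuickCheck.HasHeaderAndTrailer(path)` is cheap enough to run on every generated file before handing it to a full parser.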
