iTextSharp PDF Read Error

Question

I have the following code:

using (var reader = new PdfReader(pdfPath))
{
    for (int pageIndex = 1; pageIndex <= reader.NumberOfPages; pageIndex++)
    {
        var text = PdfTextExtractor.GetTextFromPage(reader, pageIndex);
        //my other logic goes here
    }
}

I'm getting a "Value cannot be null" error at this line:

using (var reader = new PdfReader(pdfPath))

I'm not sure why it fails for a few PDFs. I can read hundreds of PDFs, but for just 4 of them I get this error.

Error:

System.ArgumentNullException: Value cannot be null.
Parameter name: key
   at System.Collections.Generic.Dictionary`2.FindEntry(TKey key)
   at System.Collections.Generic.Dictionary`2.TryGetValue(TKey key, TValue& value)
   at System.util.collections.HashSet2`1.AddAndCheck(T item)
   at iTextSharp.text.pdf.PdfReader.PageRefs.IteratePages(PRIndirectReference rpage)
   at iTextSharp.text.pdf.PdfReader.PageRefs.IteratePages(PRIndirectReference rpage)
   at iTextSharp.text.pdf.PdfReader.PageRefs.IteratePages(PRIndirectReference rpage)
   at iTextSharp.text.pdf.PdfReader.PageRefs.IteratePages(PRIndirectReference rpage)
   at iTextSharp.text.pdf.PdfReader.PageRefs.IteratePages(PRIndirectReference rpage)
   at iTextSharp.text.pdf.PdfReader.PageRefs.ReadPages()
   at iTextSharp.text.pdf.PdfReader.PageRefs..ctor(PdfReader reader)
   at iTextSharp.text.pdf.PdfReader.ReadPages()
   at iTextSharp.text.pdf.PdfReader.ReadPdf()
   at iTextSharp.text.pdf.PdfReader..ctor(IRandomAccessSource byteSource, Boolean partialRead, Byte[] ownerPassword, X509Certificate certificate, ICipherParameters certificateKey, Boolean closeSourceOnConstructorError)
   at iTextSharp.text.pdf.PdfReader..ctor(String filename)

My iTextSharp version is 5.5.7.0

Answer 1

The simplest reason would be that on those 4 PDFs, pdfPath is null instead of a string. Check for a null value in pdfPath.

Answer 2

The paths of those 4 PDFs may be invalid, meaning there is no PDF file at those locations.
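
The two checks suggested so far (a null path and a missing file) can be ruled out before the reader is ever constructed. The following is only a minimal sketch: `PdfPathCheck` and `DescribeProblem` are hypothetical names introduced for illustration, not part of iTextSharp. Note that the stack trace in the question shows the exception being raised inside `PdfReader.PageRefs.IteratePages`, so a check like this may well pass for the failing files.

```csharp
using System;
using System.IO;

static class PdfPathCheck
{
    // Hypothetical helper: returns null when the path looks usable,
    // otherwise a short description of what is wrong with it.
    public static string DescribeProblem(string pdfPath)
    {
        if (string.IsNullOrWhiteSpace(pdfPath))
            return "path is null or empty";
        if (!File.Exists(pdfPath))
            return "no file at " + pdfPath;
        return null;
    }
}
```

Calling `PdfPathCheck.DescribeProblem(pdfPath)` just before `new PdfReader(pdfPath)` and logging any non-null result should show quickly whether the path is the problem.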

Answer 3

Just to close this topic: I asked the PDF supplier to regenerate the files in question. They regenerated and resent them, and I can now process them without any code changes. It appears that something in the original PDF content could not be read properly by iTextSharp. I still wonder why, because nothing changed in their process or in ours. The files may have been corrupted somewhere along the way.
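
When only a handful of files out of hundreds are unreadable, it can help to let the batch continue and record which files failed, so they can be sent back to the supplier. The following is a rough sketch under that assumption: `PdfBatch` and `ExtractAll` are hypothetical names, and catching `Exception` broadly is a deliberate simplification for diagnostics, not a recommendation for production code.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

static class PdfBatch
{
    // Hypothetical helper: tries to read every file and returns one
    // "path -> error" line for each file that could not be processed.
    public static List<string> ExtractAll(IEnumerable<string> pdfPaths)
    {
        var failures = new List<string>();
        foreach (var pdfPath in pdfPaths)
        {
            try
            {
                using (var reader = new PdfReader(pdfPath))
                {
                    for (int pageIndex = 1; pageIndex <= reader.NumberOfPages; pageIndex++)
                    {
                        var text = PdfTextExtractor.GetTextFromPage(reader, pageIndex);
                        // other per-page logic goes here
                    }
                }
            }
            catch (Exception ex)
            {
                // Broad catch on purpose: the goal is to list unreadable files,
                // whatever exception the reader throws for them.
                failures.Add((pdfPath ?? "<null>") + " -> " + ex.GetType().Name + ": " + ex.Message);
            }
        }
        return failures;
    }
}
```

The returned list can be written to a log after the run, which gives a concrete set of files to ask the supplier to regenerate.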

3 answers

solution1
0 2015-11-02 21:37:40

solution2
0 2015-11-05 21:01:55

solution3
0 ACCEPTED 2015-11-06 17:05:07
