Using iTextSharp to merge PDF files within a folder

I'm trying to use the code below to merge the PDF files in a folder and write the result to a new file, but the generated file appears to be corrupted.

public Boolean MergeForm(String destinationFile, String sourceFolder)
    {
        try
        {
            using (MemoryStream stream = new MemoryStream())
            using (Document doc = new Document())
            using (PdfCopy pdf = new PdfCopy(doc, stream))
            {
                doc.Open();

                PdfReader reader = null;
                PdfImportedPage page = null;

                foreach (var file in Directory.GetFiles(sourceFolder))
                {
                    reader = new PdfReader(file);
                    for (int i = 0; i < reader.NumberOfPages; i++)
                    {
                        page = pdf.GetImportedPage(reader, i + 1);
                        pdf.AddPage(page);
                    }

                    pdf.FreeReader(reader);
                    reader.Close();
                }
                using (FileStream streamX = new FileStream(destinationFile, FileMode.Create))
                {
                    stream.WriteTo(streamX);
                }
            }
            return true;
        }
        catch (Exception)
        {
            return false;
        }
    }

Can anyone spot where the problem is? Thank you.

"Can anyone spot where the problem is?"

Your main problem is that you use the contents of the MemoryStream before the Document and the PdfCopy have finished creating the PDF, which only happens during the Dispose at the end of the using block. As a result, you save an incomplete PDF file.

Doing it like this instead should work:

    using (MemoryStream stream = new MemoryStream())
    {
        using (Document doc = new Document())
        {
            PdfCopy pdf = new PdfCopy(doc, stream);
            pdf.CloseStream = false;
            doc.Open();

            PdfReader reader = null;
            PdfImportedPage page = null;

            foreach (var file in Directory.GetFiles(sourceFolder))
            {
                reader = new PdfReader(file);
                for (int i = 0; i < reader.NumberOfPages; i++)
                {
                    page = pdf.GetImportedPage(reader, i + 1);
                    pdf.AddPage(page);
                }

                pdf.FreeReader(reader);
                reader.Close();
            }
        }
        using (FileStream streamX = new FileStream(destinationFile, FileMode.Create))
        {
            stream.WriteTo(streamX);
        }
    }

BTW, you can also see here that I did not put the PdfCopy into a using block. This is because the Document implicitly closes the PdfCopy when it is disposed. Disposing the PdfCopy first and then the Document (which tries to close the PdfCopy again) is therefore unnecessary, and it can cause exceptions thrown inside the block to be hidden by other exceptions occurring in this closing circus.

Furthermore, I needed to add pdf.CloseStream = false; otherwise the memory stream would have been closed when the PdfCopy is closed.
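The same ordering issue can be reproduced without iTextSharp. The following is only an analogy, not iTextSharp code: a StreamWriter stands in for the PdfCopy, and its leaveOpen argument plays the role of pdf.CloseStream = false. The class and method names are made up for this illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class DisposeOrderDemo
{
    // Hypothetical helper: returns the MemoryStream length measured
    // before and after the wrapping writer has been disposed.
    public static long[] MeasureLengths()
    {
        using (MemoryStream stream = new MemoryStream())
        {
            long before;

            // leaveOpen: true keeps the MemoryStream usable after the writer
            // is disposed, much like pdf.CloseStream = false does for PdfCopy.
            using (StreamWriter writer = new StreamWriter(
                stream, new UTF8Encoding(false), 1024, leaveOpen: true))
            {
                writer.Write("hello");

                // The text still sits in the writer's internal buffer here,
                // so reading the stream now would yield incomplete data.
                before = stream.Length;
            }

            // Dispose has flushed the buffer; the stream is now complete.
            long after = stream.Length;
            return new[] { before, after };
        }
    }

    public static void Main()
    {
        long[] lengths = MeasureLengths();
        Console.WriteLine("before dispose: {0} bytes", lengths[0]);
        Console.WriteLine("after dispose:  {0} bytes", lengths[1]);
    }
}
```

Copying the stream inside the inner using block corresponds to the broken version in the question; copying it after the block corresponds to the corrected version.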


That being said,

  1. Of course, you should also use AddDocument instead of iterating over the document pages yourself, as already explained by @Bruno.
  2. Your memory footprint would decrease if you wrote directly to the file stream instead of going through the memory stream.
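The two suggestions above could be combined roughly as follows. This is a minimal sketch, assuming an iTextSharp 5.x version in which PdfCopy offers AddDocument. The class name PdfFolderMerger, the "*.pdf" filter, the sorting, and the exclusion of the output file are assumptions added for illustration and are not part of the original code. Unlike the original method, this sketch lets exceptions propagate instead of swallowing them.

```csharp
using System;
using System.IO;
using System.Linq;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PdfFolderMerger
{
    // Hypothetical helper; name and signature are illustrative only.
    public static bool MergeFolder(string destinationFile, string sourceFolder)
    {
        string destinationFullPath = Path.GetFullPath(destinationFile);

        // Collect the inputs up front in a stable order, skipping the output
        // file in case it lives in the same folder (an added assumption).
        string[] files = Directory.GetFiles(sourceFolder, "*.pdf")
            .Where(f => !string.Equals(Path.GetFullPath(f), destinationFullPath,
                                       StringComparison.OrdinalIgnoreCase))
            .OrderBy(f => f, StringComparer.OrdinalIgnoreCase)
            .ToArray();

        if (files.Length == 0)
        {
            // Nothing to merge; a Document without pages cannot be closed cleanly.
            return false;
        }

        // Write straight to the destination file, no MemoryStream in between.
        using (FileStream output = new FileStream(destinationFile, FileMode.Create))
        using (Document doc = new Document())
        {
            // No using block for PdfCopy: disposing the Document closes it.
            PdfCopy pdf = new PdfCopy(doc, output);
            doc.Open();

            foreach (string file in files)
            {
                PdfReader reader = new PdfReader(file);
                try
                {
                    // Copies all pages of the source document in one call.
                    pdf.AddDocument(reader);
                }
                finally
                {
                    reader.Close();
                }
            }
        }

        return true;
    }
}
```

Here CloseStream can stay at its default: the PDF is complete once the Document has been disposed, and nothing reads from the output stream afterwards, so it does not matter that PdfCopy closes it.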
