PDFsharp out-of-memory exception when combining multiple PDF files

I have to merge a large number of PDF files (the exact count is not known in advance) into a single PDF. For this I'm using the following PDFsharp code:

    // Get some file names
    string[] files = filesToPrint.ToArray();

    // Open the output document
    PdfDocument outputDocument = new PdfDocument();

    PdfPage newPage; 

    int nProcessedFile = 0;
    int nMemoryFile = 5;
    int nStepConverted = 0;
    String sNameLastCombineFile = ""; 


    // Iterate files
    foreach (string file in files)
    {
        // Open the document to import pages from it.
        PdfDocument inputDocument = PdfReader.Open(file, PdfDocumentOpenMode.Import);

        // Iterate pages
        int count = inputDocument.PageCount;
        for (int idx = 0; idx < count; idx++)
        {
            // Get the page from the external document...
            PdfPage page = inputDocument.Pages[idx];
            // ...and add it to the output document.
            outputDocument.AddPage(page);                                
        }

        nProcessedFile++;
        if (nProcessedFile >= nMemoryFile)
        {
            //nProcessedFile = 0;
            //nStepConverted++;
            //sNameLastCombineFile = "ConcatenatedDocument" + nStepConverted.ToString() + " _tempfile.pdf";

            //outputDocument.Save(sNameLastCombineFile);
            //outputDocument.Close();                 
        }
    }
    // Save the document...
    const string filename = "ConcatenatedDocument1_tempfile.pdf";
    outputDocument.Save(filename);
    // ...and start a viewer.
    Process.Start(filename);

For a small number of files the code works, but beyond a certain point it throws an out-of-memory exception.

Is there a solution?

P.S. I was thinking of saving the output file in steps and then adding the remaining files to it, so as to free memory, but I can't find a way to do that.

UPDATE 1:

    if (nProcessedFile >= nMemoryFile)
    {
        nProcessedFile = 0;
        //nStepConverted++;
        sNameLastCombineFile = "ConcatenatedDocument" + nStepConverted.ToString() + " _tempfile.pdf";

        outputDocument.Save(sNameLastCombineFile);
        outputDocument.Close();

        outputDocument = PdfReader.Open(sNameLastCombineFile,PdfDocumentOpenMode.Modify);
    }

UPDATE 2 (version 1.32): complete example. The error occurs on this line:

    PdfDocument inputDocument = PdfReader.Open(file, PdfDocumentOpenMode.Import);

Error text: Cannot handle iref streams. The current implementation of PDFsharp cannot handle this PDF feature introduced with Acrobat 6.

using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            List<String> filesToPrint = new List<string>();

            filesToPrint = Directory.GetFiles(@"D:\Downloads\RACCOLTA\FILE PDF", "*.pdf").ToList();

            // Get some file names
            string[] files = filesToPrint.ToArray();

            // Open the output document
            PdfDocument outputDocument = new PdfDocument();

            PdfPage newPage;

            int nProcessedFile = 0;
            int nMemoryFile = 5;
            int nStepConverted = 0;
            String sNameLastCombineFile = "";

            try
            {
                // Iterate files
                foreach (string file in files)
                {
                    // Open the document to import pages from it.
                    PdfDocument inputDocument = PdfReader.Open(file, PdfDocumentOpenMode.Import);

                    // Iterate pages
                    int count = inputDocument.PageCount;
                    for (int idx = 0; idx < count; idx++)
                    {
                        // Get the page from the external document...
                        PdfPage page = inputDocument.Pages[idx];
                        // ...and add it to the output document.
                        outputDocument.AddPage(page);
                    }

                    nProcessedFile++;
                    if (nProcessedFile >= nMemoryFile)
                    {
                        nProcessedFile = 0;
                        //nStepConverted++;
                        sNameLastCombineFile = "ConcatenatedDocument" + nStepConverted.ToString() + " _tempfile.pdf";

                        outputDocument.Save(sNameLastCombineFile);
                        outputDocument.Close();

                        outputDocument = PdfReader.Open(sNameLastCombineFile, PdfDocumentOpenMode.Modify);
                    }
                }
                // Save the document...
                const string filename = "ConcatenatedDocument1_tempfile.pdf";
                outputDocument.Save(filename);
                // ...and start a viewer.
                Process.Start(filename);

            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
                Console.ReadKey();

            }
        }
    }
}

UPDATE 3: code that throws the out-of-memory exception:

            int count = inputDocument.PageCount;
            for (int idx = 0; idx < count; idx++)
            {
                // Get the page from the external document...
                newPage = inputDocument.Pages[idx];
                // ...and add it to the output document.
                outputDocument.AddPage(newPage);

                newPage.Close();
            }

I can't tell exactly which line throws the exception.

I had a similar issue; saving, closing and reopening the PdfDocument did not really help.

I am adding a lot (100+) of large (up to 5 MB) images (TIFF, JPG, etc.) to a PDF document, where every image has its own page. It crashed around image #50. With the save-close-reopen step it did finish the whole document, but memory use was still close to the maximum, around 3 GB. A few more images and it would still crash.

After more refining, I wrapped the XGraphics object in a using statement; that was a little better again, but not by much.

The big step forward was disposing of the XImage within the loop! After that the application never used more than 100-200 KB. I removed the save-close-reopen for the PdfDocument and it was no problem.
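A minimal sketch of that pattern, assuming one image per page and the PDFsharp 1.x GDI+ build. The class name `ImagePdfBuilder`, the method `Build`, and its parameters are hypothetical and chosen only for illustration; stretching the image to the full page is also an assumption, not something stated above.

```csharp
using System.Collections.Generic;
using PdfSharp.Drawing;
using PdfSharp.Pdf;

static class ImagePdfBuilder
{
    // Hypothetical helper: writes one page per image and releases each
    // XImage and XGraphics before the next image is loaded.
    public static void Build(IEnumerable<string> imagePaths, string outputPath)
    {
        using (PdfDocument document = new PdfDocument())
        {
            foreach (string path in imagePaths)
            {
                PdfPage page = document.AddPage();

                // Both objects wrap unmanaged resources; the using blocks
                // dispose them at the end of every iteration.
                using (XImage image = XImage.FromFile(path))
                using (XGraphics gfx = XGraphics.FromPdfPage(page))
                {
                    // Assumption: stretch the image over the whole page.
                    gfx.DrawImage(image, 0, 0, page.Width.Point, page.Height.Point);
                }
            }

            // Note: PDFsharp refuses to save a document without pages.
            document.Save(outputPath);
        }
    }
}
```

It could be called as `ImagePdfBuilder.Build(Directory.GetFiles(folder, "*.jpg"), "images.pdf");`, where `folder` is whatever directory holds the images.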

After saving and closing outputDocument (the code is commented out in your snippet), you have to open outputDocument again, using PdfDocumentOpenMode.Modify.

It could help to add using(...) for the inputDocument.
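Putting both suggestions together might look roughly like the sketch below. The names `PdfMerger`, `Merge`, and `filesPerBatch` are hypothetical, and whether this lowers peak memory enough will depend on the input files; treat it as a starting point rather than a verified fix.

```csharp
using System.Collections.Generic;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

static class PdfMerger
{
    // Hypothetical helper: merges files into outputPath, saving and
    // reopening the output after every filesPerBatch input files.
    public static void Merge(IList<string> files, string outputPath, int filesPerBatch)
    {
        if (files.Count == 0)
            return; // PDFsharp cannot save a document without pages.

        PdfDocument output = new PdfDocument();
        int inBatch = 0;

        foreach (string file in files)
        {
            // Dispose each input document as soon as its pages are imported.
            using (PdfDocument input = PdfReader.Open(file, PdfDocumentOpenMode.Import))
            {
                for (int idx = 0; idx < input.PageCount; idx++)
                    output.AddPage(input.Pages[idx]);
            }

            inBatch++;
            if (inBatch >= filesPerBatch)
            {
                // Save, close, then reopen in Modify mode so that
                // more pages can still be appended.
                output.Save(outputPath);
                output.Close();
                output = PdfReader.Open(outputPath, PdfDocumentOpenMode.Modify);
                inBatch = 0;
            }
        }

        // Final save covers the last, possibly partial, batch.
        output.Save(outputPath);
        output.Close();
    }
}
```

A call such as `PdfMerger.Merge(files, "ConcatenatedDocument.pdf", 5);` would mirror the batch size of 5 used in the question.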

If your code is running as a 32-bit process, then switching to 64-bit will allow your process to use more than 2 GB of RAM (assuming your computer has more than 2 GB of RAM).
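To check which mode the process is actually running in, a small helper along these lines can be used. `BitnessInfo` and `Describe` are hypothetical names; `Environment.Is64BitProcess` is available from .NET Framework 4.0 onwards.

```csharp
using System;

static class BitnessInfo
{
    // Returns "64-bit" or "32-bit" for the current process.
    public static string Describe()
    {
        return Environment.Is64BitProcess ? "64-bit" : "32-bit";
    }
}
```

Printing `BitnessInfo.Describe()` at startup shows the mode in use. If it reports 32-bit on a 64-bit machine, the project's platform target (or the "Prefer 32-bit" build option in Visual Studio) is the likely place to look.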

Update: The message "Cannot handle iref streams" means you have to use PDFsharp 1.50 Prerelease, available on NuGet.
