itextsharp trimming pdf document's pages

I have a pdf document that has form fields that I'm filling out programmatically with C#. Depending on three conditions, I need to trim (delete) some of the pages from that document.

Is that possible to do?

For condition 1: I need to keep pages 1-4 but delete pages 5 and 6.

For condition 2: I need to keep pages 1-4, delete page 5, and keep page 6.

For condition 3: I need to keep pages 1-5 but delete page 6.

Use PdfReader.SelectPages() combined with PdfStamper. The code below uses iTextSharp 5.5.1.

public void SelectPages(string inputPdf, string pageSelection, string outputPdf)
{
    using (PdfReader reader = new PdfReader(inputPdf))
    {
        reader.SelectPages(pageSelection);

        using (PdfStamper stamper = new PdfStamper(reader, File.Create(outputPdf)))
        {
            stamper.Close();
        }
    }
}

Then call this method with the correct page selection for each condition.

Condition 1:

SelectPages(inputPdf, "1-4", outputPdf);

Condition 2:

SelectPages(inputPdf, "1-4,6", outputPdf);

or

SelectPages(inputPdf, "1-6,!5", outputPdf);

Condition 3:

SelectPages(inputPdf, "1-5", outputPdf);

Here's the comment from the iTextSharp source code on what makes up a page selection. It is in the SequenceList class, which is used to process a page selection:

/**
* This class expands a string into a list of numbers. The main use is to select a
* range of pages.
* <p>
* The general systax is:<br>
* [!][o][odd][e][even]start-end
* <p>
* You can have multiple ranges separated by commas ','. The '!' modifier removes the
* range from what is already selected. The range changes are incremental, that is,
* numbers are added or deleted as the range appears. The start or the end, but not both, can be ommited.
*/
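
To make the incremental behaviour described in that comment concrete, here is a minimal sketch of how a selection string could be expanded. It is not iTextSharp's SequenceList implementation: PageSelectionSketch and Expand are hypothetical names, and the sketch assumes only comma-separated single pages, closed start-end ranges, and the '!' modifier (no odd/even modifiers, no open-ended ranges).

```csharp
using System;
using System.Collections.Generic;

public static class PageSelectionSketch
{
    // Hypothetical helper for illustration only; not part of iTextSharp.
    // Expands a selection such as "1-6,!5" into page numbers, processing
    // the comma-separated parts from left to right.
    public static List<int> Expand(string selection, int pageCount)
    {
        var pages = new List<int>();
        foreach (string rawPart in selection.Split(','))
        {
            string part = rawPart.Trim();
            if (part.Length == 0) continue;

            // A leading '!' removes the range from what is already selected.
            bool exclude = part.StartsWith("!");
            if (exclude) part = part.Substring(1);

            // Assumption: either a single page ("5") or a closed range ("1-4").
            string[] bounds = part.Split('-');
            int start = int.Parse(bounds[0]);
            int end = bounds.Length > 1 ? int.Parse(bounds[1]) : start;

            for (int page = start; page <= end; page++)
            {
                if (page < 1 || page > pageCount) continue; // ignore pages outside the document
                if (exclude) pages.RemoveAll(p => p == page);
                else pages.Add(page);
            }
        }
        return pages;
    }
}
```

With a six-page document, Expand("1-6,!5", 6) and Expand("1-4,6", 6) both give pages 1, 2, 3, 4 and 6, which is why the two calls shown for condition 2 are interchangeable.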

Instead of deleting pages in a document, what you actually do is create a new document and import only the pages that you want to keep. Below is a full working WinForms app that does that (targeting iTextSharp 5.1.1.0). The last parameter to the function removePagesFromPdf is an array of pages to keep.

The code below works off of physical files but would be very easy to convert to something based on streams, so that you don't have to write to disk if you don't want to.

using System;
using System.ComponentModel;
using System.IO;
using System.Linq;
using System.Windows.Forms;
using iTextSharp.text.pdf;
using iTextSharp.text;


namespace Full_Profile1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            //The files that we are working with
            string sourceFolder = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
            string sourceFile = Path.Combine(sourceFolder, "Test.pdf");
            string destFile = Path.Combine(sourceFolder, "TestOutput.pdf");

            //Remove all pages except 1,2,3,4 and 6
            removePagesFromPdf(sourceFile, destFile, 1, 2, 3, 4, 6);
            this.Close();
        }
        public void removePagesFromPdf(String sourceFile, String destinationFile, params int[] pagesToKeep)
        {
            //Used to pull individual pages from our source
            PdfReader r = new PdfReader(sourceFile);
            //Create our destination file
            using (FileStream fs = new FileStream(destinationFile, FileMode.Create, FileAccess.Write, FileShare.None))
            {
                using (Document doc = new Document())
                {
                    using (PdfWriter w = PdfWriter.GetInstance(doc, fs))
                    {
                        //Open the destination for writing
                        doc.Open();
                        //Loop through each page that we want to keep
                        foreach (int page in pagesToKeep)
                        {
                            //Add a new blank page to destination document
                            doc.NewPage();
                            //Extract the given page from our reader and add it directly to the destination PDF
                            w.DirectContent.AddTemplate(w.GetImportedPage(r, page), 0, 0);
                        }
                        //Close our document
                        doc.Close();
                    }
                }
            }
        }
    }
}
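
The stream-based conversion mentioned above might look roughly like the following sketch. It reuses the same iTextSharp calls as removePagesFromPdf but reads from a byte array and writes to a MemoryStream. PdfTrimSketch and KeepPages are hypothetical names introduced here, and the code assumes an iTextSharp 5.x version in which Document and PdfWriter can be used in using blocks, as in the listing above.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PdfTrimSketch
{
    // Hypothetical in-memory variant of removePagesFromPdf.
    // Returns a new PDF that contains only the pages listed in pagesToKeep.
    public static byte[] KeepPages(byte[] sourcePdf, params int[] pagesToKeep)
    {
        PdfReader reader = new PdfReader(sourcePdf);
        try
        {
            using (MemoryStream ms = new MemoryStream())
            {
                using (Document doc = new Document())
                using (PdfWriter writer = PdfWriter.GetInstance(doc, ms))
                {
                    doc.Open();
                    foreach (int page in pagesToKeep)
                    {
                        // Start a new page and draw the imported source page onto it
                        doc.NewPage();
                        writer.DirectContent.AddTemplate(writer.GetImportedPage(reader, page), 0, 0);
                    }
                    doc.Close();
                }
                // ToArray still works after the writer has closed the stream
                return ms.ToArray();
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

A call such as PdfTrimSketch.KeepPages(File.ReadAllBytes(sourceFile), 1, 2, 3, 4, 6) would then correspond to the file-based example, with the caller deciding whether the result is ever written to disk.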

Here is the code I use to copy all but the last page of an existing PDF. Everything is in memory streams. The variable pdfByteArray is a byte[] of the original PDF obtained using ms.ToArray(). pdfByteArray is overwritten with the new PDF.

PdfReader originalPDFReader = new PdfReader(pdfByteArray);

using (MemoryStream msCopy = new MemoryStream())
{
    using (Document docCopy = new Document())
    {
        using (PdfCopy copy = new PdfCopy(docCopy, msCopy))
        {
            docCopy.Open();
            // Copy every page except the last one
            for (int pageNum = 1; pageNum <= originalPDFReader.NumberOfPages - 1; pageNum++)
            {
                copy.AddPage(copy.GetImportedPage(originalPDFReader, pageNum));
            }
            docCopy.Close();
        }
    }

    // Overwrite the original bytes with the trimmed PDF
    pdfByteArray = msCopy.ToArray();
}

originalPDFReader.Close();

I know this is an old post; I'm simply extending @chris-haas's solution a step further.

Delete the selected pages, then save the result to a separate PDF file.

//ms is MemoryStream and fs is FileStream

ms.CopyTo(fs);

Save the stream to a separate PDF file. This works for me without errors.

pageRange="5"

pageRange="2,15-20"

pageRange="1-5,15-20"

You can pass pageRange values like the samples above.

private void DeletePagesNew(string pageRange, string SourcePdfPath, string OutputPdfPath, string Password = "")
{
    try
    {
        var pagesToDelete = new List<int>();

        if (pageRange.IndexOf(",") != -1)
        {
            var tmpHold = pageRange.Split(',');

            foreach (string nonconseq in tmpHold)
            {

                if (nonconseq.IndexOf("-") != -1)
                {
                    var rangeHold = nonconseq.Split('-');

                    for (int i = Convert.ToInt32(rangeHold[0]), loopTo = Convert.ToInt32(rangeHold[1]); i <= loopTo; i++)
                        pagesToDelete.Add(i);
                }
                else
                {
                    pagesToDelete.Add(Convert.ToInt32(nonconseq));
                }
            }
        }

        else if (pageRange.IndexOf("-") != -1)
        {
            var rangeHold = pageRange.Split('-');

            for (int i = Convert.ToInt32(rangeHold[0]), loopTo1 = Convert.ToInt32(rangeHold[1]); i <= loopTo1; i++)
                pagesToDelete.Add(i);
        }
        else
        {
            pagesToDelete.Add(Convert.ToInt32(pageRange));
        }

        var Reader = new PdfReader(SourcePdfPath);
        int[] pagesToKeep;
        pagesToKeep = Enumerable.Range(1, Reader.NumberOfPages).ToArray();

        using (var ms = new MemoryStream())
        {

            using (var fs = new FileStream(OutputPdfPath, FileMode.Create, FileAccess.Write, FileShare.None))
            {

                using (var doc = new Document())
                {

                    using (PdfWriter w = PdfWriter.GetInstance(doc, fs))
                    {
                        doc.Open();

                        foreach (int p in pagesToKeep)
                        {
                            if (pagesToDelete.FindIndex(s => s == p) != -1)
                            {
                                continue;
                            }

                            // doc.NewPage()
                            // w.DirectContent.AddTemplate(w.GetImportedPage(Reader, p), 0, 0)
                            // 
                            // Use the rotated size so rotated pages get a matching page box
                            doc.SetPageSize(Reader.GetPageSizeWithRotation(p));
                            doc.NewPage();
                            PdfContentByte cb = w.DirectContent;
                            PdfImportedPage pageImport = w.GetImportedPage(Reader, p);
                            int rot = Reader.GetPageRotation(p);

                            if (rot == 90 || rot == 270)
                            {
                                cb.AddTemplate(pageImport, 0, -1.0f, 1.0f, 0, 0, Reader.GetPageSizeWithRotation(p).Height);
                            }
                            else
                            {
                                cb.AddTemplate(pageImport, 1.0f, 0, 0, 1.0f, 0, 0);
                            }

                            cb = default;
                            pageImport = default;
                            rot = default;
                        }

                        ms.CopyTo(fs);
                        fs.Flush();
                        doc.Close();
                    }
                }

            }
        }

        pagesToDelete = null;
        Reader.Close();
        Reader = default;
    }

    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);

    }
}
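
The range parsing at the top of DeletePagesNew can be written more compactly. The sketch below is one possible way to do it, not the author's code: PageRangeSketch and PagesToKeep are hypothetical names, and it assumes the same input format as the samples above (comma-separated pages and closed ranges such as "2,15-20"). It uses no iTextSharp calls, so the caller has to supply the page count.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PageRangeSketch
{
    // Hypothetical helper for illustration only.
    // Parses a delete range such as "2,15-20" and returns the pages that remain.
    public static int[] PagesToKeep(string pageRange, int pageCount)
    {
        var toDelete = new HashSet<int>();
        foreach (string part in pageRange.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
        {
            // "15-20" becomes first = 15, last = 20; "2" becomes first = last = 2
            string[] bounds = part.Split('-');
            int first = int.Parse(bounds[0].Trim());
            int last = bounds.Length > 1 ? int.Parse(bounds[1].Trim()) : first;

            for (int page = first; page <= last; page++)
                toDelete.Add(page);
        }

        // Keep every page of the document that was not marked for deletion
        return Enumerable.Range(1, pageCount).Where(p => !toDelete.Contains(p)).ToArray();
    }
}
```

The returned array could be passed as the pagesToKeep argument of a method like removePagesFromPdf, using Reader.NumberOfPages as the page count.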
