Merging PDFs and removing blank space with iTextSharp

I have a problem working with image-only PDF files (each PDF contains a single image and no text). There are two PDF files, img1 and img2, and I want to combine them into a single A4-page PDF file.

I have tried the code below.

using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

string Img1 = "C:/temp/image1.pdf";
string Img2 = "C:/temp/image2.pdf";
string MergedFile = "C:/temp/Combo.pdf";

//Create our PDF readers
PdfReader r1 = new PdfReader(Img1);
PdfReader r2 = new PdfReader(Img2);

//Our new page size, an A3 in landscape mode
iTextSharp.text.Rectangle NewPageSize = PageSize.A3.Rotate();

using (FileStream fs = new FileStream(MergedFile, FileMode.Create, 
                                  FileAccess.Write, FileShare.None))
{
    //Create our document without margins
    using (Document doc = new Document(NewPageSize, 0, 0, 0, 0))
    {
        using (PdfWriter w = PdfWriter.GetInstance(doc, fs))
        {
            doc.Open();
            //Get our imported pages
            PdfImportedPage imp1 = w.GetImportedPage(r1, 1);
            PdfImportedPage imp2 = w.GetImportedPage(r2, 1);
            //Add them to our merged document at specific X/Y coords
            w.DirectContent.AddTemplate(imp1, 0, 0);
            w.DirectContent.AddTemplate(imp2, 0, -350);
            doc.Close();
        }
    }
}
r1.Close();
r2.Close();

When I run the code above, both PDFs end up on one page, but only because I hard-coded the Y coordinate.

I don't want to do that.

This example uses just two images, but in practice there are more than 20 images (each converted to a PDF).

The files should be combined according to each image's size; I cannot hard-code a Y coordinate for every file.

Can anyone help me combine multiple PDFs into a single one with no blank space?

Structurally, here is what you want to do:

  • Allocate a new page of the "right" size
  • Merge the content streams of the pages
  • Merge the resources of the pages
  • Adjust all the annotations (if any)

The first step is easy, the second is easy, and the third not so much (it also has the side effect of complicating step 2). I'll tell you ahead of time that I lied to you about the order.
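For the first step, a minimal sketch of how the "right" page size and the per-page Y offsets could be derived from the source page heights, so that nothing is hard-coded. `StackLayout`, `TotalHeight`, and `OffsetsY` are hypothetical names used only for illustration. The sketch assumes the pages are stacked top to bottom with no gap and that the PDF origin is the lower-left corner. In iTextSharp 5 the heights would presumably come from something like `reader.GetPageSizeWithRotation(1).Height`, and each offset could be passed as the Y argument of `AddTemplate`, as in the question's code.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class StackLayout
{
    // Height of the merged page: the sum of all source page heights.
    public static float TotalHeight(IReadOnlyList<float> heights)
    {
        return heights.Sum();
    }

    // Y offset of the lower edge of each source page on the merged page.
    // heights[0] is the page that should appear at the top.
    public static float[] OffsetsY(IReadOnlyList<float> heights)
    {
        var offsets = new float[heights.Count];
        float top = TotalHeight(heights);   // start at the top edge
        for (int i = 0; i < heights.Count; i++)
        {
            top -= heights[i];              // move down by this page's height
            offsets[i] = top;
        }
        return offsets;
    }
}
```

For heights of 200, 350, and 100 points, the merged page would be 650 points tall and the offsets would be 450, 100, and 0. The merged page width would typically be the widest source page.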

Merging the content streams is straightforward. It is a four-step process (I'll note here that I know PDF very well, but iTextSharp not so well):

  1. Insert a gsave operator (q)
  2. Insert a transform operator (cm) to move to the location where you want the content to appear. In your case it will be 1 0 0 1 X Y cm, where X and Y are the offsets for that page
  3. Copy the content streams from the current page
  4. Insert a grestore operator (Q)
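The four steps above can be sketched as a small helper that wraps an already decoded content stream. `ContentWrapper` and `Wrap` are hypothetical names, not part of iTextSharp. The sketch treats the stream as text for readability; a real content stream is a byte sequence that usually has to be decoded from its filters first.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class ContentWrapper
{
    // Wraps a page's content in q ... Q and translates it by (x, y).
    public static string Wrap(string pageContent, float x, float y)
    {
        var sb = new StringBuilder();
        sb.Append("q\n");                                   // 1. save graphics state
        sb.AppendFormat(CultureInfo.InvariantCulture,
                        "1 0 0 1 {0} {1} cm\n", x, y);      // 2. translate
        sb.Append(pageContent);                             // 3. original content
        if (!pageContent.EndsWith("\n", StringComparison.Ordinal))
            sb.Append('\n');
        sb.Append("Q\n");                                   // 4. restore graphics state
        return sb.ToString();
    }
}
```

The invariant culture is used so that the numbers are always written with a period as the decimal separator, which is what PDF syntax expects.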

To merge the resources, look at the newly created page's resources and, for the current page, do one of three things for each resource in each resource class of a PDF page (XObject, Font, ColorSpace, ExtGState, Pattern, Shading, ProcSet; for ProcSet you could set each one to the entire suite and do no harm):

  1. If the resource exists in the newly created page, but under a different name, mark it as renamed.
  2. If the resource does not exist in the newly created page and there is no resource with the same name, copy it in.
  3. If the resource does not exist in the newly created page and there is a name conflict, rename the resource to a synthetic name not in the newly created page and copy it in.

Now to get back to my lie. During resource merging, you will likely need to build a map for the current page from old resource names to new resource names. Then, when copying the content stream from one page to the next, you must rewrite every resource name referenced in the content stream to the new name produced by the resource merge step. In other words, the resource merge has to happen before the content stream copy.
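A rough sketch of the three merge cases and the resulting rename map, for a single resource class. `ResourceMerger`, `Merge`, and `RewriteXObjectNames` are hypothetical names. Resources are modelled here as plain name-to-object-id dictionaries (for example "Im0" mapped to "9 0 R"), which is an assumption made for illustration and not how iTextSharp represents them. The rewrite helper only handles XObject names used with the Do operator and would need a real content stream tokenizer to be safe in general.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ResourceMerger
{
    // Merges 'source' into 'target' (which is modified) and returns a map
    // from each old resource name to the name it has on the merged page.
    public static Dictionary<string, string> Merge(
        Dictionary<string, string> target,
        Dictionary<string, string> source,
        string prefix)
    {
        var rename = new Dictionary<string, string>();
        int counter = 0;
        foreach (var kv in source)
        {
            // Case 1: the same object is already present, maybe under another name.
            string existing = target.FirstOrDefault(t => t.Value == kv.Value).Key;
            if (existing != null) { rename[kv.Key] = existing; continue; }

            // Case 2: not present and the name is free, so copy it in as-is.
            if (!target.ContainsKey(kv.Key))
            {
                target[kv.Key] = kv.Value;
                rename[kv.Key] = kv.Key;
                continue;
            }

            // Case 3: name conflict, so copy it in under a synthetic name.
            string fresh;
            do { fresh = prefix + counter++; }
            while (target.ContainsKey(fresh) || source.ContainsKey(fresh));
            target[fresh] = kv.Value;
            rename[kv.Key] = fresh;
        }
        return rename;
    }

    // Rewrites "/Name Do" operands according to the rename map.
    public static string RewriteXObjectNames(
        string content, Dictionary<string, string> rename)
    {
        return Regex.Replace(content, @"/([^\s/\[\]<>()]+)(\s+Do)\b", m =>
        {
            string oldName = m.Groups[1].Value;
            string newName = rename.TryGetValue(oldName, out var n) ? n : oldName;
            return "/" + newName + m.Groups[2].Value;
        });
    }
}
```

The map returned by `Merge` is what the content stream copy needs, which is why the resource step effectively has to run first.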

To adjust annotations, you will have to move them to their new location by adjusting the Rect entry in each. You will also need to reset each annotation's reference to its page (the /P entry). For any of the text markup annotations, you will also need to adjust the QuadPoints.
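Moving an annotation amounts to adding the page's offset to its coordinates. A minimal sketch, assuming the Rect and QuadPoints values have already been read into flat arrays of alternating x and y values; `AnnotationShift` and `Translate` are hypothetical names, and reading or writing the arrays through iTextSharp is left out.

```csharp
public static class AnnotationShift
{
    // Shifts a flat array of alternating x, y coordinates by (dx, dy).
    // Works for Rect ([llx lly urx ury]) and for QuadPoints alike.
    public static float[] Translate(float[] coords, float dx, float dy)
    {
        var moved = new float[coords.Length];
        for (int i = 0; i < coords.Length; i++)
        {
            // Even indexes are x values, odd indexes are y values.
            moved[i] = coords[i] + (i % 2 == 0 ? dx : dy);
        }
        return moved;
    }
}
```

For example, a Rect of [10 20 110 70] on a page placed at offset (0, 450) would become [10 470 110 520].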

Now, here is where the works get gummed up. If a page is rotated, this will not work. If a page has a crop box, you will have to look at it and adjust the clipping region to simulate the crop. If the page is rotated and has Text annotations, you will need to pay attention to the annotation flags to ensure that the aspect ratio is correct. If the document has link annotations on any of the pages with GoTo actions or destinations, you will need to adjust these as well.
