Tagging individual pages of a PDF with iTextSharp (C#)

I am currently working with iTextSharp 5.5.6.0.

My goal is to add a key to each page and have those keys persist when I read the document again with another application. I want to be able to keep track of every page individually (each key is unique and comes from another source).

This is my import/write code:

using (PdfReader reader = new PdfReader(sourcePdfPath))
using (System.IO.FileStream output = new System.IO.FileStream(outputPdfPath, System.IO.FileMode.Create))
using (Document document = new Document(reader.GetPageSizeWithRotation(pageNumber)))
{
    PdfCopy pdfCopyProvider = new PdfCopy(document, output);
    // Tagging and the PDF version have to be set before the document is opened.
    pdfCopyProvider.SetTagged();
    pdfCopyProvider.PdfVersion = PdfWriter.VERSION_1_7;
    document.Open();

    // pageNumber is the 1-based number of the page being copied.
    PdfImportedPage importedPage = pdfCopyProvider.GetImportedPage(reader, pageNumber, true);
    importedPage.SetAccessibleAttribute(PdfName.ALT, new PdfString("MYKEY"));
    pdfCopyProvider.AddPage(importedPage);
}

This is my read code:

using (MemoryStream ms = new MemoryStream())
{
    Document document = new Document();
    PdfCopy copy = new PdfCopy(document, ms);
    copy.SetTagged();
    document.Open();
    for (int i = 0; i < pdfs.Count; ++i)
    {
        var pdf = File.ReadAllBytes(pdfs[i]);
        PdfReader reader = new PdfReader(pdf);
        int n = reader.NumberOfPages;
        // Page numbers are 1-based.
        for (int page = 1; page <= n; ++page)
        {
            var importPage = copy.GetImportedPage(reader, page, true);
            var myKey = importPage.GetAccessibleAttribute(PdfName.ALT);
            if (myKey != null)
            {
                // Do something with the key
            }
            // Braces above keep this call outside the if, so every page is copied.
            copy.AddPage(importPage);
        }
    }
    document.Close();
    copy.Close();

    return ms.ToArray();
}

I am trying to add an accessibility ALT text. Currently, I use that attribute on images, and all applications are set to leave those attributes untouched.

The problem is that when I add the attribute this way, save it to a PDF file, and then read it in another process, the attribute is no longer there.

I am open to other ways of giving each page a primary key that I can assign, read, and remove.

I am trying to avoid adding a hidden field on each page.

I have little experience with iText programming or with C#, so I'm ideal to answer your question :)

First of all, if all you want to do is mark a page and find it again afterwards, please do not use the accessibility features in the PDF. Accessibility is there for assistive devices; abusing those features isn't nice.

Especially because - if I understand correctly what you want to do - there is no need to do so. If you want to mark a page, you should look for the page dictionary, for example:

PdfReader reader = new iTextSharp.text.pdf.PdfReader(file_content);
PdfDictionary pageDict = reader.GetPageN(i);

Copied from: http://goobbe.com/questions/8099416/how-to-get-the-userunit-property-from-a-pdffile-using-itextsharp-pdfreader

Once you have that dictionary, you can insert your own private key into it. This is the Java signature; in iTextSharp the method is named Put:

public void put(PdfName key, PdfObject object);

The value you assign is up to you, but if you want to follow the rules, you have to use a second-class PDF name as the key. Such a key consists of your developer prefix, which should be registered so that it is unique, followed by a private part. For example, a key could look like:

FICL:PageNumber

In that case "FICL" is your developer prefix and "PageNumber" is your identification of the data you are adding.

To register a developer prefix, see the Adobe web site, for example here: http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/pdfregistry_v3.pdf
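Putting the steps above together, here is a minimal sketch of writing and reading such a key with iTextSharp 5.x. The class name PageKeys and its method names are made up for illustration, "FICL:PageNumber" is just the example key from this answer (substitute your own prefix), and the use of PdfStamper to save the modified document is an assumption about how you want to write the result, not something the question prescribes.

```csharp
using System.IO;
using iTextSharp.text.pdf;

// Hypothetical helper; the names are illustrative and not part of iTextSharp.
public static class PageKeys
{
    // Example second-class name from the answer; replace "FICL" with your own prefix.
    public static readonly PdfName PageKey = new PdfName("FICL:PageNumber");

    // Copies sourcePath to outputPath, storing value in the page dictionary of one page.
    public static void SetKey(string sourcePath, string outputPath, int pageNumber, string value)
    {
        using (PdfReader reader = new PdfReader(sourcePath))
        using (FileStream output = new FileStream(outputPath, FileMode.Create))
        {
            // Page numbers are 1-based.
            PdfDictionary pageDict = reader.GetPageN(pageNumber);
            pageDict.Put(PageKey, new PdfString(value));

            // The stamper writes the reader's (now modified) objects when it is disposed.
            using (PdfStamper stamper = new PdfStamper(reader, output))
            {
            }
        }
    }

    // Returns the stored value, or null if the page carries no such key.
    public static string GetKey(string path, int pageNumber)
    {
        using (PdfReader reader = new PdfReader(path))
        {
            PdfString value = reader.GetPageN(pageNumber).GetAsString(PageKey);
            return value == null ? null : value.ToUnicodeString();
        }
    }
}
```

To remove the key again, the same pattern should work with pageDict.Remove(PageKey) in place of the Put call. Whether other applications in your pipeline keep unknown page dictionary entries intact is something you would need to check for each of them.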

Hope this helps.

PS: If anyone here knows who actually owns the "FICL" prefix and where the letters come from, I'll buy you a beer :)

David's answer is correct and it's the answer that should be accepted. However, I'm adding an extra answer for the sake of completeness.

The OP's question is about adding an extra key to the page dictionary of existing pages in a PDF. If you want to add a key to a PDF that is built from scratch using iText, you can use the addPageDictEntry() method in PdfWriter (AddPageDictEntry() in iTextSharp). This will add an entry to the page dictionary of the next page object that is created by a PdfWriter instance.

It's something that could be automated by using page events, for instance if you want to give each page a unique ID by adding a custom entry to the page dictionary of each page that is created with iText.

(This doesn't answer the OP's question because he's not using a PdfWriter, but this answer could be useful for other people with the same question in the context of creating PDFs from scratch.)
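As a rough sketch of the page-event idea for documents created from scratch with iTextSharp 5.x: the class name PageIdEvent and the key "FICL:PageId" are hypothetical, and the choice of a GUID as the ID is only an assumption for illustration. The sketch relies on entries registered in OnEndPage being applied to the page that is just being finished.

```csharp
using System;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical page event that stores a fresh ID in every page dictionary.
public class PageIdEvent : PdfPageEventHelper
{
    // Illustrative second-class name; replace "FICL" with your own prefix.
    public static readonly PdfName PageId = new PdfName("FICL:PageId");

    public override void OnEndPage(PdfWriter writer, Document document)
    {
        // The writer adds this entry to the page object it creates next,
        // which is assumed here to be the page that is ending.
        writer.AddPageDictEntry(PageId, new PdfString(Guid.NewGuid().ToString()));
    }
}
```

It would be hooked up with writer.PageEvent = new PageIdEvent(); right after PdfWriter.GetInstance(document, stream) and before document.Open().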
