
find and replace string in PDF

I'm looking for a way to replace text in a PDF using C#. The use case: we have a client who needs to sign a PDF, and we want to pre-populate a few of the fields (date, name, title, etc.) before they download it. I've found a few potential options such as PDFSharp, but I can't find a way to search the document by text.

Resources I've found so far are:

Find a word in PDF using PDFSharp

https://forum.pdfsharp.net/viewtopic.php?p=4010

However, I wasn't able to get them working for my use case. Any help would be greatly appreciated.

UPDATE: Here is the boilerplate code I've been working with to try the search and replace:

string toFind = "client-title";
string toReplace = "John Doe";
PdfSharp.Pdf.PdfDocument PDFDoc = PdfReader.Open("path/to/original/file.pdf", PdfDocumentOpenMode.Import);
PdfSharp.Pdf.PdfDocument PDFNewDoc = new PdfSharp.Pdf.PdfDocument();

for(int i = 0; i < PDFDoc.Pages.Count; i++)
{
    // Find toFind string and replace with toReplace string

    PDFNewDoc.AddPage(PDFDoc.Pages[i]);
}
PDFNewDoc.Save("path/to/new/file.pdf");

My sample below simply replaces the word 'Hello' with 'Hola':

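using System.Diagnostics;
using PdfSharp.Drawing;
using PdfSharp.Pdf;
using PdfSharp.Pdf.Content;
using PdfSharp.Pdf.Content.Objects;
using PdfSharp.Pdf.IO;

class Program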
    {
        static void Main(string[] args)
        {
            string originalPdf = @"C:\origPdf.pdf";

            CreatePdf(originalPdf);

            using (var doc = PdfReader.Open(originalPdf, PdfDocumentOpenMode.Modify))
            {
                var page = doc.Pages[0];
                var contents = ContentReader.ReadContent(page);

                ReplaceText(contents, "Hello", "Hola");
                page.Contents.ReplaceContent(contents);

                doc.Pages.Remove(page);
                doc.AddPage().Contents.ReplaceContent(contents);
               
                doc.Save(originalPdf);
            }

            Process.Start(originalPdf);

        }

        // Code from http://www.pdfsharp.net/wiki/HelloWorld-sample.ashx
        public static void CreatePdf(string filename)
        {
            // Create a new PDF document
            PdfDocument document = new PdfDocument();
            document.Info.Title = "Created with PDFsharp";

            // Create an empty page
            PdfPage page = document.AddPage();

            // Get an XGraphics object for drawing
            XGraphics gfx = XGraphics.FromPdfPage(page);

            // Create a font
            XFont font = new XFont("Verdana", 20, XFontStyle.BoldItalic, new XPdfFontOptions(PdfFontEncoding.WinAnsi));

            // Draw the text
            gfx.DrawString("Hello, World!", font, XBrushes.Black,
              new XRect(0, 0, page.Width, page.Height),
              XStringFormats.Center);

            // Save the document; Main opens it in a viewer afterwards.
            document.Save(filename);
        }

        // See the PDF specification for details on what a content stream contains:
        // https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
        public static void ReplaceText(CSequence contents, string searchText, string replaceText)
        {
            // Iterate through each content item. An item may not contain the entire
            // word if different styling (e.g. a bold part of the word) is applied within it,
            // so you may have to replace a character at a time.
            for (int i = 0; i < contents.Count; i++)
            {
                if (contents[i] is COperator)
                {
                    var cOp = contents[i] as COperator;
                    for (int j = 0; j < cOp.Operands.Count; j++)
                    {
                        if (cOp.OpCode.Name == OpCodeName.Tj.ToString() ||
                            cOp.OpCode.Name == OpCodeName.TJ.ToString())
                        {
                            if (cOp.Operands[j] is CString)
                            {
                                var cString = cOp.Operands[j] as CString;
                                if (cString.Value.Contains(searchText))
                                {
                                    cString.Value = cString.Value.Replace(searchText, replaceText);
                                }

                            }
                        }
                    }


                }
            }


        }
    }
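
To tie this back to the loop in the question, here is a rough sketch of running the same replacement over every page and saving to a new file. The `PdfTextReplacer` class and its method names are made up for illustration; the PDFsharp calls are the ones used in the sample above. It additionally assumes that PDFsharp's content parser hands the operand of a `TJ` operator back as a `CArray` of strings and spacing numbers, so it also looks one level down into arrays. Like the sample, it can only match text that sits in a single string with a simple encoding (such as the WinAnsi font used in `CreatePdf`); text split across fragments, or drawn with a subsetted font that lacks the replacement glyphs, will probably not come out right.

```csharp
using PdfSharp.Pdf;
using PdfSharp.Pdf.Content;
using PdfSharp.Pdf.Content.Objects;
using PdfSharp.Pdf.IO;

// Hypothetical helper class, not part of PDFsharp.
static class PdfTextReplacer
{
    // Replaces searchText on every page of sourcePath and saves the result
    // to targetPath. Returns the number of strings that were changed.
    public static int ReplaceInAllPages(string sourcePath, string targetPath,
                                        string searchText, string replaceText)
    {
        int changed = 0;
        using (PdfDocument doc = PdfReader.Open(sourcePath, PdfDocumentOpenMode.Modify))
        {
            for (int i = 0; i < doc.Pages.Count; i++)
            {
                PdfPage page = doc.Pages[i];
                CSequence contents = ContentReader.ReadContent(page);

                int hits = ReplaceInSequence(contents, searchText, replaceText);
                if (hits > 0)
                {
                    // Only rewrite the content stream of pages that changed.
                    page.Contents.ReplaceContent(contents);
                    changed += hits;
                }
            }
            doc.Save(targetPath);
        }
        return changed;
    }

    // Walks the content stream and rewrites the operands of the
    // text-showing operators Tj and TJ.
    private static int ReplaceInSequence(CSequence contents, string searchText, string replaceText)
    {
        int hits = 0;
        for (int i = 0; i < contents.Count; i++)
        {
            var op = contents[i] as COperator;
            if (op == null)
                continue;

            if (op.OpCode.Name == OpCodeName.Tj.ToString() ||
                op.OpCode.Name == OpCodeName.TJ.ToString())
            {
                hits += ReplaceInOperands(op.Operands, searchText, replaceText);
            }
        }
        return hits;
    }

    // Replaces text in string operands. Assumption: a TJ operand arrives as a
    // CArray mixing CString fragments and numbers, so arrays are searched too.
    private static int ReplaceInOperands(CSequence operands, string searchText, string replaceText)
    {
        int hits = 0;
        for (int j = 0; j < operands.Count; j++)
        {
            var text = operands[j] as CString;
            if (text != null)
            {
                if (text.Value.Contains(searchText))
                {
                    text.Value = text.Value.Replace(searchText, replaceText);
                    hits++;
                }
                continue;
            }

            var array = operands[j] as CArray;
            if (array != null)
                hits += ReplaceInOperands(array, searchText, replaceText);
        }
        return hits;
    }
}
```

With the placeholder values from the question, the call would look like `PdfTextReplacer.ReplaceInAllPages("path/to/original/file.pdf", "path/to/new/file.pdf", "client-title", "John Doe");`. If it returns 0, the placeholder is most likely stored in a form this simple string match cannot see.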
