Find and replace string in PDF

I'm searching for a way to replace text in a PDF in C#. The use case is that we have a client who needs to sign a PDF, and we want to prepopulate a few of the fields (date, name, title, etc.) before they download it. I've found a few potential options such as PDFSharp, but I can't find a way to search based on text.

Resources I've found so far are:

Find a word in PDF using PDFSharp

https://forum.pdfsharp.net/viewtopic.php?p=4010

However, I wasn't able to get them working for my use case. Any help would be greatly appreciated.

UPDATE: Here is the boilerplate code I've been working with to try to do the search and replace:

string toFind = "client-title";
string toReplace = "John Doe";
PdfSharp.Pdf.PdfDocument PDFDoc = PdfReader.Open("path/to/original/file.pdf", PdfDocumentOpenMode.Import);
PdfSharp.Pdf.PdfDocument PDFNewDoc = new PdfSharp.Pdf.PdfDocument();

for(int i = 0; i < PDFDoc.Pages.Count; i++)
{
    // Find toFind string and replace with toReplace string

    PDFNewDoc.AddPage(PDFDoc.Pages[i]);
}
PDFNewDoc.Save("path/to/new/file.pdf");

My sample below simply replaces the word 'Hello' with 'Hola':

```csharp
using System.Diagnostics;
using PdfSharp.Drawing;
using PdfSharp.Pdf;
using PdfSharp.Pdf.Content;
using PdfSharp.Pdf.Content.Objects;
using PdfSharp.Pdf.IO;

class Program
{
    static void Main(string[] args)
    {
        string originalPdf = @"C:\origPdf.pdf";

        CreatePdf(originalPdf);

        using (var doc = PdfReader.Open(originalPdf, PdfDocumentOpenMode.Modify))
        {
            var page = doc.Pages[0];
            var contents = ContentReader.ReadContent(page);

            ReplaceText(contents, "Hello", "Hola");
            page.Contents.ReplaceContent(contents);

            doc.Pages.Remove(page);
            doc.AddPage().Contents.ReplaceContent(contents);

            doc.Save(originalPdf);
        }

        // Open the modified file in the default PDF viewer.
        Process.Start(originalPdf);
    }

    // Code from http://www.pdfsharp.net/wiki/HelloWorld-sample.ashx
    public static void CreatePdf(string filename)
    {
        // Create a new PDF document
        PdfDocument document = new PdfDocument();
        document.Info.Title = "Created with PDFsharp";

        // Create an empty page
        PdfPage page = document.AddPage();

        // Get an XGraphics object for drawing
        XGraphics gfx = XGraphics.FromPdfPage(page);

        // Create a font
        XFont font = new XFont("Verdana", 20, XFontStyle.BoldItalic, new XPdfFontOptions(PdfFontEncoding.WinAnsi));

        // Draw the text
        gfx.DrawString("Hello, World!", font, XBrushes.Black,
            new XRect(0, 0, page.Width, page.Height),
            XStringFormats.Center);

        // Save the document
        document.Save(filename);
    }

    // Please refer to the PDF specification for everything a content stream can contain:
    // https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
    public static void ReplaceText(CSequence contents, string searchText, string replaceText)
    {
        // Iterate through the content items. A single item may not contain the entire
        // word if different styling (e.g. bold on part of the word) is applied to it,
        // so you may have to replace one character at a time.
        for (int i = 0; i < contents.Count; i++)
        {
            var cOp = contents[i] as COperator;
            if (cOp == null)
                continue;

            // Only the text-showing operators Tj and TJ are inspected.
            if (cOp.OpCode.Name != OpCodeName.Tj.ToString() &&
                cOp.OpCode.Name != OpCodeName.TJ.ToString())
                continue;

            for (int j = 0; j < cOp.Operands.Count; j++)
            {
                var cString = cOp.Operands[j] as CString;
                if (cString != null && cString.Value.Contains(searchText))
                {
                    cString.Value = cString.Value.Replace(searchText, replaceText);
                }
            }
        }
    }
}
```
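
One limitation worth noting: `ReplaceText` only looks at operands that are themselves `CString` objects. In the PDF specification the `TJ` operator takes a single array operand whose elements are string fragments interleaved with numeric position adjustments, so strings nested in that array would not be reached. Below is a minimal sketch of one way to descend into such arrays. It assumes that PDFsharp's `CArray` derives from `CSequence` (as it appears to in PDFsharp 1.5x) and that the font uses a simple encoding such as WinAnsi, so `CString.Value` holds readable text. `ContentTextReplacer` and `ReplaceInOperands` are hypothetical names, not part of PDFsharp.

```csharp
using PdfSharp.Pdf.Content.Objects;

public static class ContentTextReplacer
{
    // Hypothetical helper (not part of PDFsharp): replaces searchText inside every
    // CString in a sequence, descending into nested sequences such as the array
    // operand of a TJ operator. Returns the number of strings that were changed.
    public static int ReplaceInOperands(CSequence operands, string searchText, string replaceText)
    {
        int changed = 0;
        for (int i = 0; i < operands.Count; i++)
        {
            if (operands[i] is CString cString)
            {
                if (cString.Value != null && cString.Value.Contains(searchText))
                {
                    cString.Value = cString.Value.Replace(searchText, replaceText);
                    changed++;
                }
            }
            else if (operands[i] is CSequence nested)
            {
                // Assumption: CArray derives from CSequence, so TJ arrays land here.
                changed += ReplaceInOperands(nested, searchText, replaceText);
            }
            // Numbers (position adjustments) and other operand types are left untouched.
        }
        return changed;
    }
}
```

Inside `ReplaceText`, the inner loop over `cOp.Operands` could then be swapped for a call to `ContentTextReplacer.ReplaceInOperands(cOp.Operands, searchText, replaceText)`. A search term that is split across several fragments of the array still will not match, since each string is checked on its own.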
