How to find an exact word in a Word document using Open XML in C#?

I need to find the exact word that I want to replace in a Word document using Open XML in C#. The purpose is to replace the user's personal details with a special character so that they are not visible to the reader.

For example, the user has an address entered in his form, which is stored in the database. He has also uploaded a Word document, and that document contains the following kind of text, which matches his address. My goal is to mask the address with the ### sign so that other users can't see it, e.g.

 "422, Plot no. 1000/A, The Moon Residency II, Shree Nagrik Co. Op. Society, Sardarnagar, Ahmedabad.

Looking for an opportunity that surpasses in making me a personality that influences the masses and that too effectively. Organizationally, I would strive to work at a single place with no professional switches being made and would love to work in an environment that demands constant evolution with variable domains incorporated to deal with."

I want to replace "Co" and "Op" with the "#" sign. My output would be this:

"422, Plot no. 1000/A, The Moon Residency II, Shree Nagrik #. #. Society, Sardarnagar, Ahmedabad.

Looking for an opportunity that surpasses in making me a personality that influences the masses and that too effectively. Organizationally, I would strive to work at a single place with no professional switches being made and would love to work in an environment that demands constant evolution with variable domains incorporated to deal with."

Now I have several questions:

  1. How can I search for a whole word? Right now my code replaces the word "opportunity" with "##portunity", since that word contains "Op". The same happens with "constant", which becomes "##nstant". I need to replace only if the whole word matches.

  2. How can I match the whole line in the Word document, or maybe the whole address? The address should be replaced as a whole; if that is not possible, at least 70-80% of it should be replaced.

Currently, my code for replacing words in the Word file is as below.

MemoryStream m = new System.IO.MemoryStream();
// strResumeName contains my Word file URL
m = objBlob.GetResumeFile(strResumeName);

using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(m, true))
{
    body = wordDoc.MainDocumentPart.Document.Body;
    colT = body.Descendants<DocumentFormat.OpenXml.Wordprocessing.Text>();
    foreach (DocumentFormat.OpenXml.Wordprocessing.Text c in colT)
    {
        if (c.InnerText.Trim() != String.Empty)
        {
            sb.Append(c.InnerText.Trim() + " ");
        }
    }
    string[] strParts = sb.ToString().Split(' ');
    HyperLinkList = HyperLinksList(wordDoc);
    redactionTags = GetReductionstrings(strParts);
}

using (Novacode.DocX document = Novacode.DocX.Load(m))
{
    // objCandidateLogin.Address contains my address
    if (!String.IsNullOrEmpty(objCandidateLogin.Address))
    {
        string[] strParts = objCandidateLogin.Address.Replace(",", " ").Split(' ');
        for (int I = 0; I <= strParts.Length - 1; I++)
        {
            if (strParts[I].Trim().Length > 1)
            {
                document.ReplaceText(strParts[I].Trim(), "#############", false, RegexOptions.IgnoreCase);
            }
        }
    }
}

You're using OpenXML together with Novacode; you should consider using just OpenXML.

As for replacing the text with "#": you will have to iterate through all paragraphs in the Word document and check the Text elements within them to see whether the text you're looking for exists. If it does, you can replace it.

Nothing else to it. Hope this helps.

IEnumerable<Paragraph> paragraphs = document.Body.Descendants<Paragraph>();
foreach (Paragraph para in paragraphs)
{
    // A paragraph can hold several Text elements (at least one per run), so visit each of them.
    foreach (Text text in para.Descendants<Text>())
    {
        // Code to replace text.Text with "#"
    }
}

I've written this code from memory, but if you proceed along these lines, you will find your solution.
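For the whole-word part of the question, one possible approach is a regular expression with `\b` word-boundary anchors, so that "Op" matches on its own but not inside "opportunity". The following is only a minimal sketch under some assumptions: the class and method names (`WholeWordRedactor`, `Redact`) are hypothetical and do not come from the Open XML SDK, and the words passed in are assumed to have already had trailing punctuation trimmed (a part such as "Co." would not match, because `\b` needs a word character next to the boundary).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the Open XML SDK.
public static class WholeWordRedactor
{
    // Replaces each listed word with the mask, but only where it stands as a whole word.
    public static string Redact(string input, IEnumerable<string> words, string mask)
    {
        // Skip blank entries and escape the rest, so characters such as '/' or '.'
        // in an address part are treated literally.
        string alternatives = string.Join("|",
            words.Where(w => !string.IsNullOrWhiteSpace(w))
                 .Select(w => Regex.Escape(w.Trim())));

        if (alternatives.Length == 0)
        {
            return input;
        }

        // \b anchors the match at word boundaries, so "Op" does not match inside "opportunity".
        string pattern = @"\b(?:" + alternatives + @")\b";
        return Regex.Replace(input, pattern, mask, RegexOptions.IgnoreCase);
    }
}
```

Inside the paragraph loop this could then be applied to each element, for example `text.Text = WholeWordRedactor.Redact(text.Text, new[] { "Co", "Op" }, "#");`. Note that Word may split a single word or an address across several runs, so replacing per Text element can miss matches that span more than one element.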

You can use the TextReplacer class in PowerTools for Open XML to accomplish what you want. Then you can do something like this:

using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;
using System.IO;

namespace SearchAndReplace
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            using (WordprocessingDocument doc = WordprocessingDocument.Open("Test01.docx", true))
                TextReplacer.SearchAndReplace(wordDoc:doc, search:"the", replace:"this", matchCase:false);
        }
    }
}

To install the NuGet package for Open XML PowerTools, run the following command in the Package Manager Console:

PM> Install-Package OpenXmlPowerTools

There is an Open XML PowerTools class for searching and replacing text in an Open XML document. Get it from here: http://openxmldeveloper.org/blog/b/openxmldeveloper/archive/2011/08/04/introducing-textreplacer-a-new-class-for-powertools-for-open-xml.aspx

Hope this helps.
