
How to replace a Paragraph's text using the OpenXML SDK

I am parsing some OpenXML Word documents using the .NET OpenXML SDK 2.0. As part of the processing, I need to replace certain sentences with other sentences. While iterating over the paragraphs, I know when I have found something I need to replace, but I am stumped as to how to replace it.

For example, let's say I need to replace the sentence "a contract exclusively for construction work that is not building work." with the HTML snippet below, which points to SharePoint Reusable Content.

<span class="ms-rtestate-read ms-reusableTextView" contentEditable="false" id="__publishingReusableFragment" fragmentid="/Sites/Sandbox/ReusableContent/132_.000" >a contract exclusively for construction work that is not building work.</span>

PS: I have the docx-to-HTML conversion worked out using XSLT, so that is not a problem at this stage.

The InnerText property of the Paragraph node gives me the proper text, but InnerText itself is not settable. So Regex.Match(currentParagraph.InnerText, currentString).Success returns true and tells me that the current paragraph contains the text I want.

As I said, InnerText itself is not settable, so I tried creating a new paragraph from the modified OuterXml, as shown below.

string modifiedOuterxml = Regex.Replace(currentParagraph.OuterXml, currentString, reusableContentString);
OpenXmlElement parent = currentParagraph.Parent;
Paragraph modifiedParagraph = new Paragraph(modifiedOuterxml);
parent.ReplaceChild<Paragraph>(modifiedParagraph, currentParagraph);

Even though I am not too concerned about the formatting at this level, and the text doesn't seem to have any, the OuterXml contains extra elements that defeat the regex.

..."16" /><w:lang w:val="en-AU" /></w:rPr><w:t>a</w:t></w:r><w:proofErr w:type="gramEnd" /> <w:r w:rsidRPr="00C73B58"><w:rPr><w:sz w:val="16" /><w:szCs w:val="16" /><w:lang w:val="en-AU" /></w:rPr><w:t xml:space="preserve"> contract exclusively for construction work that is not building work.</w:t></w:r></w:p>
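To make the failure concrete, here is a minimal sketch that uses only System.Xml.Linq. The paragraph XML in it is a simplified, made-up sample modeled on the fragment above (not the real document), and the class name RunSplitDemo is hypothetical. It shows that the sentence never appears contiguously in the raw XML, yet joining the w:t values of the runs recovers it, which is roughly what InnerText reports.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class RunSplitDemo
{
    static void Main()
    {
        XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
        const string sentence =
            "a contract exclusively for construction work that is not building work.";

        // Simplified sample (assumption): two runs separated by a proofErr marker.
        string xml =
            "<w:p xmlns:w=\"" + w.NamespaceName + "\">" +
            "<w:r><w:t>a</w:t></w:r>" +
            "<w:proofErr w:type=\"gramEnd\" />" +
            "<w:r><w:t xml:space=\"preserve\"> contract exclusively for construction work that is not building work.</w:t></w:r>" +
            "</w:p>";

        XElement paragraph = XElement.Parse(xml);

        // Concatenate the text of every w:t element, in document order.
        string joined = string.Concat(paragraph.Descendants(w + "t").Select(t => t.Value));

        Console.WriteLine(xml.Contains(sentence)); // False: markup sits between "a" and " contract"
        Console.WriteLine(joined == sentence);     // True: the runs together hold the sentence
    }
}
```

This is why a search over InnerText succeeds while the same search over OuterXml finds nothing.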

So in summary, how would I replace the text in an OpenXML Paragraph with other text, even at the expense of losing some of the formatting?

Fixed it myself. The key was to remove all the runs and create a new run in the current paragraph:

string modifiedString = Regex.Replace(currentParagraph.InnerText, currentString, reusableContentString);
currentParagraph.RemoveAllChildren<Run>();
currentParagraph.AppendChild<Run>(new Run(new Text(modifiedString)));
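The same idea can be wrapped in a small helper. This is only a sketch under a few assumptions: the names ParagraphTextReplacer and ReplaceText are hypothetical, and it swaps Regex.Replace for string.Replace so that characters such as "." in the search text or "$" in the replacement are treated literally. Like the snippet above, it collapses the paragraph into a single run, so run-level formatting is lost.

```csharp
using System;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

static class ParagraphTextReplacer
{
    // Hypothetical helper: returns true if the paragraph contained 'search' and was rewritten.
    public static bool ReplaceText(Paragraph paragraph, string search, string replacement)
    {
        string original = paragraph.InnerText;
        if (!original.Contains(search))
        {
            return false;
        }

        // string.Replace is literal: no regex metacharacters, no "$1"-style substitutions.
        string modified = original.Replace(search, replacement);

        // Drop every existing run, then add one run holding the whole modified text.
        paragraph.RemoveAllChildren<Run>();
        paragraph.AppendChild(new Run(
            new Text(modified) { Space = SpaceProcessingModeValues.Preserve }));
        return true;
    }
}

class Demo
{
    static void Main()
    {
        // In-memory paragraph whose sentence is split across two runs.
        var paragraph = new Paragraph(
            new Run(new Text("a")),
            new Run(new Text(" contract exclusively for construction work that is not building work.")
            {
                Space = SpaceProcessingModeValues.Preserve
            }));

        bool changed = ParagraphTextReplacer.ReplaceText(
            paragraph,
            "a contract exclusively for construction work that is not building work.",
            "New text here");

        Console.WriteLine(changed);
        Console.WriteLine(paragraph.InnerText);
    }
}
```

Setting Space to Preserve keeps leading or trailing spaces in the new text from being dropped when Word opens the document.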

Paragraph text is stored in Text elements inside the paragraph's runs, so when the whole sentence sits in a single Text element you just have to find that element and update its text, for example:

// mainPart is the document's MainDocumentPart.
var text = mainPart.Document.Descendants<Text>()
    .FirstOrDefault(e => e.Text == "a contract exclusively for construction work that is not building work.");
if (text != null)
{
    text.Text = "New text here";
}
mainPart.Document.Save();
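For context, a fuller sketch of that lookup might look like the following. It assumes the .docx path is passed as the first command-line argument, and the class name ReplaceInDocument is hypothetical. It matches with Contains rather than equality, so a Text element holding more than the sentence still qualifies, and it reports the case where no single Text element holds the sentence, which is what happens when Word has split it across runs.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

class ReplaceInDocument
{
    static void Main(string[] args)
    {
        string path = args[0]; // assumption: path to an existing .docx file
        const string sentence =
            "a contract exclusively for construction work that is not building work.";

        // Open the package for editing (second argument: isEditable = true).
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, true))
        {
            MainDocumentPart mainPart = doc.MainDocumentPart;

            // Look for one Text element that contains the whole sentence.
            Text text = mainPart.Document
                .Descendants<Text>()
                .FirstOrDefault(t => t.Text.Contains(sentence));

            if (text != null)
            {
                text.Text = text.Text.Replace(sentence, "New text here");
                mainPart.Document.Save();
            }
            else
            {
                // The sentence may be split across several runs; rebuilding
                // the paragraph's runs is the fallback in that case.
                Console.WriteLine("No single Text element contains the sentence.");
            }
        }
    }
}
```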
