How can I remove empty paragraph elements?

I am trying to remove paragraphs that contain "{SomeText}". The method below does that, but I noticed that after I remove the paragraphs, empty paragraph elements are left over.

How can I remove <w:p /> elements programmatically?

Below is what I initially used to remove the paragraphs.

        using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(file, true))
        {
            MainDocumentPart mainPart = wordDoc.MainDocumentPart;
            Document D = mainPart.Document;

            foreach (Paragraph P in D.Descendants<Paragraph>())
            {
                if (P.InnerText.Contains("{SomeText}"))
                {
                    P.RemoveAllChildren();
                    //P.Remove();   //doesn't remove
                }
            }
            D.Save();
        }

This is what document.xml looks like afterwards:

<w:p />
<w:p />
<w:p />
<w:p />
<w:p />
<w:p />
<w:p />

The problem is here:

        foreach (Paragraph P in D.Descendants<Paragraph>())
        {
            if (P.InnerText.Contains("{SomeText}"))
            {
                P.Remove();   //doesn't remove
            }
        }

You are removing an item from the collection while you are still iterating over it. For some reason, the Open XML SDK does not throw an exception here; it just silently exits the foreach loop. Attaching a debugger and stepping through will show you that. The fix is simple:

        foreach (Paragraph P in D.Descendants<Paragraph>().ToList())
        {
            if (P.InnerText.Contains("{SomeText}"))
            {
                P.Remove();   //will now remove
            }
        }

By adding ToList() you copy the paragraph references (a shallow copy) into a separate list and iterate over that list instead. Now when you remove a paragraph, it is removed from the D.Descendants<Paragraph>() collection but not from your list, so the iteration continues. ToList() is a LINQ extension method, so the file needs using System.Linq;.
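The snapshot idea can be tried without a .docx file by building a small body in memory. This is only a sketch: the class name SnapshotRemovalDemo and the helper RemoveMatching are made up for illustration and are not part of the Open XML SDK, and it assumes the DocumentFormat.OpenXml package is referenced.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class SnapshotRemovalDemo
{
    // Hypothetical helper (not an SDK method): removes every paragraph under
    // root whose text contains marker and returns how many were removed.
    public static int RemoveMatching(OpenXmlElement root, string marker)
    {
        // ToList() runs the whole query first, so the list is complete
        // before any paragraph is detached from the tree.
        var matches = root.Descendants<Paragraph>()
                          .Where(p => p.InnerText.Contains(marker))
                          .ToList();

        foreach (var p in matches)
        {
            p.Remove();
        }

        return matches.Count;
    }

    public static void Main()
    {
        // Three paragraphs, two of which carry the placeholder.
        var body = new Body(
            new Paragraph(new Run(new Text("keep me"))),
            new Paragraph(new Run(new Text("{SomeText} first"))),
            new Paragraph(new Run(new Text("{SomeText} second"))));

        int removed = RemoveMatching(body, "{SomeText}");
        int remaining = body.Elements<Paragraph>().Count();
        Console.WriteLine($"removed {removed}, remaining {remaining}");
    }
}
```

Filtering before ToList() keeps the list small; the important part is that the list is fully built before the first Remove() call.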

The answer above helped me write the following snippet, which deletes the paragraphs between begin and end (excluding the begin and end paragraphs themselves). This approach is handy when you must use a template as input but do not want some parts of it in the output.

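public void RemoveParagraphsFromDocument(string begin, string end)
{
    // OutputPath is a member of the containing class (not shown here).
    using (var wordDoc = WordprocessingDocument.Open(OutputPath, true))
    {
        var mainPart = wordDoc.MainDocumentPart;
        var doc = mainPart.Document;
        // Snapshot the paragraphs so removals do not disturb the iteration.
        var paragraphs = doc.Descendants<Paragraph>().ToList();
        var beginIndex = paragraphs.FindIndex(par => par.InnerText.Equals(begin));
        var endIndex = paragraphs.FindIndex(par => par.InnerText.Equals(end));

        // FindIndex returns -1 when a marker is missing; without this check a
        // missing begin marker would delete everything before the end marker.
        if (beginIndex < 0 || endIndex < 0)
        {
            return;
        }

        // Remove only the paragraphs strictly between the two markers.
        for (var i = beginIndex + 1; i < endIndex; i++)
        {
            paragraphs[i].Remove();
        }

        doc.Save();
    }
}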
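If a document has already been processed with RemoveAllChildren() as in the question, the leftover <w:p /> elements can be cleaned up with the same snapshot pattern. The sketch below is an illustration, not an SDK feature: EmptyParagraphCleanup and RemoveEmptyParagraphs are hypothetical names, and skipping empty paragraphs that sit directly inside a table cell is a cautious assumption on my part, since a cell left without any paragraph may not open cleanly in Word.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class EmptyParagraphCleanup
{
    // Hypothetical helper: removes paragraphs that have no child elements
    // at all (serialized as <w:p />) and returns how many were removed.
    public static int RemoveEmptyParagraphs(OpenXmlElement root)
    {
        var empties = root.Descendants<Paragraph>()
            .Where(p => !p.HasChildren)
            // Assumption: leave empty paragraphs inside table cells alone.
            .Where(p => !(p.Parent is TableCell))
            .ToList(); // snapshot before removing anything

        foreach (var p in empties)
        {
            p.Remove();
        }

        return empties.Count;
    }
}
```

With an open WordprocessingDocument this would be called as EmptyParagraphCleanup.RemoveEmptyParagraphs(wordDoc.MainDocumentPart.Document.Body), followed by Document.Save(). A paragraph that still contains only properties or an empty run has children, so it is not touched by this check.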
