Constant NullReference Exception using OpenXML on a DOCX file

I'm trying to parse a lengthy file and remove the sections I don't want. From my research, the OpenXML SDK appears to be the easiest library for manipulating and searching a Word document. Unfortunately, it isn't consistent: I keep getting NullReferenceExceptions when I assign nodes, for example to a Run object. Essentially, my program should go through the docx file, find the tag (Ver 1), and remove everything between it and the closing tag (/Ver 1). This works for some sections, but for others I get the NullReferenceException. I suspect it has to do with the messy formatting that MS Word uses, but I don't know.

Here's the code for one particular section. If anyone could help, I'd appreciate it.

IEnumerable<OpenXmlElement> elem = main.Document.Body.Descendants().ToList();
foreach (OpenXmlElement elems in elem)
{
   if (elems is Text && elems.InnerText == s_Ver1)// s_Ver1 = "(Ver 1)"
   {
      Run run = (Run)elems.Parent;
      Paragraph p = (Paragraph)run.Parent;
      p.RemoveAllChildren();
      p.Remove();

      foreach (OpenXmlElement endelems in elem)
      {
         if (endelems is Text && elems.InnerText == e_Ver1)//e_Ver1 = "(/Ver1)"
         {
            run = (Run)endelems.Parent;
            p = (Paragraph)run.Parent;
            p.Remove();
            break;
         }

         else
         {
            /*
            Run d_Run = (Run)endelems.Parent;
            Paragraph d_p = (Paragraph)d_Run.Parent;
            d_p.RemoveAllChildren();
            d_p.Remove();*/

            try
            {
               endelems.Remove();
            }

            catch(Exception err)
            {
               MessageBox.Show(err.ToString());
            }
          }
       }
    }
}

Edit

With a try/catch inside the code (around endelems.Remove()):

 System.InvalidOperationException: The Parent of this element is Null
 //it also says line 141 but I'm not sure how to get line numbering in vs2010

With a try/catch around the entire thing:

 System.NullReferenceException: Object reference not set to an instance of an object
 //line 114 which would be Paragraph p = (Paragraph)run.Parent; line

I am not quite sure what you are trying to do here, but...

You take a static list of descendants from the body.

You then iterate over elements that may already have been deleted, and call Remove() on a child that was already removed by RemoveAllChildren().
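One way to guard against this, sketched under the assumption that a detached element reports a null Parent (which matches the "The Parent of this element is Null" message in the question): check Parent before calling Remove(). The SafeRemoval class and TryRemove name are hypothetical helpers, not part of the OpenXML SDK.

```csharp
using DocumentFormat.OpenXml;

// Hypothetical helper, not an SDK type.
public static class SafeRemoval
{
    // Removes the element only if it is still attached to a parent.
    // Returns true when a removal actually happened.
    public static bool TryRemove(OpenXmlElement element)
    {
        // An element that was already removed (directly, or through the
        // parent's RemoveAllChildren()) has no parent; Remove() would throw.
        if (element == null || element.Parent == null)
        {
            return false;
        }

        element.Remove();
        return true;
    }
}
```

In the question's inner loop, calling SafeRemoval.TryRemove(endelems) instead of endelems.Remove() would skip the elements that are already detached rather than throwing.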

Not to mention this faulty logic:

if (endelems is Text && elems.InnerText == e_Ver1)//e_Ver1 = "(/Ver1)"
{
    ...
}
else
{
    Run d_Run = (Run)endelems.Parent;
}

In the else clause, endelems is probably not a Text element, so its parent is probably not a Run.
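A minimal sketch of a lookup that avoids the hard casts: it uses `as` to test for a Text and Ancestors<Paragraph>() to find the owning paragraph, returning null instead of throwing when the element is not a marker or is detached. MarkerLookup and FindMarkerParagraph are hypothetical names. Note also that the quoted condition tests elems.InnerText rather than endelems.InnerText, which looks like a typo that would send every element down the else branch.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not an SDK type.
public static class MarkerLookup
{
    // Returns the paragraph that contains a Text equal to 'marker',
    // or null when 'element' is not such a Text or has no paragraph ancestor.
    public static Paragraph FindMarkerParagraph(OpenXmlElement element, string marker)
    {
        // 'as' yields null for non-Text elements instead of throwing.
        Text text = element as Text;
        if (text == null || text.Text != marker)
        {
            return null;
        }

        // Walk up the tree; this also covers runs nested in other containers.
        return text.Ancestors<Paragraph>().FirstOrDefault();
    }
}
```

The caller then checks the result for null before touching it, so a stray element never reaches a cast.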

--- EDIT --- pseudocode

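// Pseudocode. Assumes wd is an open WordprocessingDocument and these usings:
// System.Collections.Generic, System.Linq, DocumentFormat.OpenXml,
// DocumentFormat.OpenXml.Wordprocessing.

// Snapshot the Text nodes with ToList() so removals do not disturb the enumeration.
List<Text> elems = wd.MainDocumentPart.Document.Body.Descendants<Text>().ToList();
foreach (Text elem in elems)
{
    if (elem.InnerText.Equals("Ver 1"))
    {
        // ElementsAfter() yields only the following siblings under the same parent.
        List<OpenXmlElement> afterelems = elem.ElementsAfter().ToList();
        foreach (OpenXmlElement openelem in afterelems)
        {
            if (openelem is Text && ((Text)openelem).InnerText.Equals("Ver 2"))
            {
                break;
            }
            else if (openelem is Text)
            {
                openelem.Remove();
            }
        }
        break;
    }
}

// Remove runs that no longer contain any text or breaks.
foreach (Run run in wd.MainDocumentPart.Document.Body.Descendants<Run>()
    .Where(r => !r.Descendants<Text>().Any() && !r.Descendants<Break>().Any()).ToList())
{
    run.Remove();
}

// Remove paragraphs that no longer contain any runs or tables.
foreach (Paragraph par in wd.MainDocumentPart.Document.Body.Descendants<Paragraph>()
    .Where(p => !p.Descendants<Run>().Any() && !p.Descendants<Table>().Any()).ToList())
{
    par.Remove();
}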
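As an alternative to the pseudocode above, here is a paragraph-level sketch of the task described in the question. It assumes each marker sits in its own paragraph and that the whole paragraphs from the start marker through the end marker should go. SectionRemover and RemoveSection are hypothetical names. Comparing against Paragraph.InnerText rather than individual Text nodes is a deliberate choice, since Word may split a marker across several runs.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper, not an SDK type.
public static class SectionRemover
{
    // Removes the paragraph containing startMarker, the paragraph containing
    // endMarker, and every paragraph between them. Returns the number removed.
    public static int RemoveSection(Body body, string startMarker, string endMarker)
    {
        // Snapshot first: the tree is modified while this list is walked.
        List<Paragraph> paragraphs = body.Descendants<Paragraph>().ToList();
        bool inside = false;
        int removed = 0;

        foreach (Paragraph paragraph in paragraphs)
        {
            // InnerText concatenates the text of all runs in the paragraph.
            string text = paragraph.InnerText;

            if (!inside && text.Contains(startMarker))
            {
                inside = true;
            }
            if (!inside)
            {
                continue;
            }

            bool isEnd = text.Contains(endMarker);

            // Skip paragraphs that were already detached.
            if (paragraph.Parent != null)
            {
                paragraph.Remove();
                removed++;
            }
            if (isEnd)
            {
                inside = false;
            }
        }
        return removed;
    }
}
```

With a document opened for editing through WordprocessingDocument.Open(path, true), this would be called as SectionRemover.RemoveSection(wd.MainDocumentPart.Document.Body, "(Ver 1)", "(/Ver 1)"), followed by wd.MainDocumentPart.Document.Save().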
