
Find and number consecutive XML elements

I have an XML document that looks roughly like this:

<root>
    Maybe some text
    <thing>thing can have text</thing>
    <thing>it can even be on multiple
    lines
    </thing>
    <thing>a third thing</thing>
    This text resets the numbering
    <thing>this thing is not part of the above list and should have number 1</thing>
    <some-element-not-thing>Also resets numbering</some-element-not-thing>
    <thing>this thing should also have number 1</thing>
</root>

I need to number the <thing> elements when they appear consecutively, by giving each one an attribute called "number". That is, my desired result is:

<root>
    Maybe some text
    <thing number="1">thing can have text</thing>
    <thing number="2">it can even be on multiple
    lines
    </thing>
    <thing number="3">a third thing</thing>
    This text resets the numbering
    <thing number="1">this thing is not part of the above list and should have number 1</thing>
    <some-element-not-thing>Also resets numbering</some-element-not-thing>
    <thing number="1">this thing should also have number 1</thing>
</root>

How would I approach something like this? I can't see a way to find text between elements in XmlDocument (though it does let me enumerate elements in order, so I can reset the numbering when I encounter something that is not a <thing>), and I am not sure LINQ to XML lets me get the text between elements either, since it seems to yield only elements or descendants, neither of which represents the "loose text". Perhaps this "loose text" is bad (but apparently parseable) XML?

EDIT: I completely misunderstood my own problem. Apparently there is no text between the elements; it was the result of an error I fixed afterwards. The solution I ended up using was simply to enumerate the nodes and alter their attributes that way (using XmlDocument and ignoring whitespace), similar to what is suggested below. I apologize for not thinking this question through more and/or spending more time researching. If people think this question does not contribute adequately to SO, I will not mind deleting it.
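The approach described in the edit (enumerating child nodes with XmlDocument while ignoring whitespace) might look roughly like the sketch below. The class and method names are hypothetical and the sample input is made up for illustration. The sketch relies on XmlDocument.PreserveWhitespace being false, so whitespace-only text between elements never shows up as a child node.

```csharp
using System;
using System.Xml;

public static class XmlDocumentNumbering
{
    // Hypothetical helper: numbers each run of consecutive <thing> children of the root.
    public static XmlDocument NumberThings(string xml)
    {
        // false is the default; set explicitly to make the assumption visible.
        var doc = new XmlDocument { PreserveWhitespace = false };
        doc.LoadXml(xml);

        var i = 0;
        foreach (XmlNode node in doc.DocumentElement.ChildNodes)
        {
            if (node is XmlElement element && element.Name == "thing")
            {
                element.SetAttribute("number", (++i).ToString());
            }
            else
            {
                // Any other element or non-blank text node ends the run.
                i = 0;
            }
        }

        return doc;
    }

    public static void Main()
    {
        var doc = NumberThings(
            "<root>\n  <thing>a</thing>\n  <thing>b</thing>\n  <other/>\n  <thing>c</thing>\n</root>");

        // Writes 1, 2 and 1 on separate lines.
        foreach (XmlElement thing in doc.DocumentElement.GetElementsByTagName("thing"))
        {
            Console.WriteLine(thing.GetAttribute("number"));
        }
    }
}
```

Note that in this sketch comments and processing instructions also reset the counter, which may or may not be what you want.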

As always, it would be helpful if you showed what you have already tried before asking. There are lots of blog posts and questions about parsing and manipulating XML.

As a start, I would parse using LINQ to XML. Then all you have to do is loop through the nodes below the root element, assigning each thing element an incrementing number. The counter is reset whenever a node is neither a thing element nor whitespace:

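// Requires: using System.Xml.Linq;
// PreserveWhitespace keeps whitespace-only text nodes, so they are skipped explicitly below.
var doc = XDocument.Parse(xml, LoadOptions.PreserveWhitespace);

var i = 0;

foreach (var node in doc.Root.Nodes())
{
    var element = node as XElement;
    var text = node as XText;

    var isThing = element != null && element.Name == "thing";
    var isWhitespace = text != null && string.IsNullOrWhiteSpace(text.Value);

    if (isThing)
    {
        // Next <thing> in the current run. SetAttributeValue overwrites an
        // existing "number" attribute, whereas Add would throw on a duplicate.
        element.SetAttributeValue("number", ++i);
    }
    else if (!isWhitespace)
    {
        // Any other element or non-blank text ends the run.
        i = 0;
    }
}

var result = doc.ToString();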
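To check the behaviour end to end, the loop can be wrapped in a small console program. This is only a sketch: the class name, the NumberThings helper, and the shortened sample input are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ThingNumbering
{
    // Hypothetical helper wrapping the loop shown above.
    public static XDocument NumberThings(string xml)
    {
        var doc = XDocument.Parse(xml, LoadOptions.PreserveWhitespace);
        var i = 0;

        foreach (var node in doc.Root.Nodes())
        {
            if (node is XElement element && element.Name == "thing")
            {
                element.SetAttributeValue("number", ++i);
            }
            else if (!(node is XText text && string.IsNullOrWhiteSpace(text.Value)))
            {
                // Non-blank text or any other node ends the run.
                i = 0;
            }
        }

        return doc;
    }

    public static void Main()
    {
        const string xml = @"<root>
    Maybe some text
    <thing>a</thing>
    <thing>b</thing>
    This text resets the numbering
    <thing>c</thing>
    <other>Also resets numbering</other>
    <thing>d</thing>
</root>";

        var numbers = NumberThings(xml).Root.Elements("thing")
            .Select(e => (string)e.Attribute("number"));

        Console.WriteLine(string.Join(",", numbers)); // prints "1,2,1,1"
    }
}
```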
