How do I find out the elements before a particular element in XML in C#?

I have an XML document in the following format:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE repub SYSTEM "C:\repub\Repub_V1.dtd">
<?xml-stylesheet href="C:\repub\repub.xsl" type="text/xsl"?>
<repubold>
    <head>
        <title>xxx</title>
    </head>
    <body>
        <sec>
            <title>First Title</title>
            <break name="1-1"/>
            <pps>This is an invalid text.</pps>
            <h1>
                <page num="1"/>First Heading
            </h1>
            <bl>This is another text</bl>
            <fig>
                <img src="images/img_1-1.jpg" alt=""/>
                <fc>This is a caption</fc>
            </fig>
            <p>
                <bold>This</bold> again
                <br/> is
                <br/>
                <bold> a 
                    <br/>paragraph
                </bold>
            </p>
        </sec>
        <sec>
            <title>Second Title</title>
            <break name="2-1"/>
            <h1>
                <page num="1"/>Second Heading
            </h1>
            <bl>This is another text</bl>
            <fig>
                <img src="images/img_2-1.jpg" alt=""/>
                <fc>This is a caption</fc>
                <cr>This is a credit</cr>
            </fig>
            <p>This is a paragraph</p>
        </sec>
        <sec>
            <title>First Title</title>
            <break name="3-1"/>
            <h1>
                <page num="1"/>Third Heading
            </h1>
            <bl>This is another text</bl>
            <fig>
                <img src="images/img_3-1.jpg" alt=""/>
                <fc>This is a caption</fc>
            </fig>
            <p>This is a paragraph</p>
        </sec>
        <sec>
            <title>Third Title</title>
            <break name="4-1"/>
            <h1>
                <page num="1"/>Fourth Heading
            </h1>
            <bl>This is another text</bl>
            <p>This is a paragraph</p>
            <fig>
                <img src="images/img_4-1.jpg" alt=""/>
                <fc>This is a caption</fc>
                <cr>This is a credit</cr>
            </fig>
            <break name="5-1"/>
            <h1>
                <page num="1"/>Fifth Heading
            </h1>
            <bl>This is another text</bl>
            <fig>
                <img src="images/img_5-1.jpg" alt=""/>
                <fc>This is a caption</fc>
                <cr>This is a credit</cr>
            </fig>
            <p>This is a paragraph</p>
        </sec>
    </body>
</repubold>

Here, every <break> tag is followed by an <h1>. I want to check the elements before each <h1>, if any. If such an element is not <psf>, an error should be shown, because <psf> is the only acceptable tag between <break> and <h1>. There can be a <psf> or nothing, but if there is any other <xyz> tag, an error should be shown.

Please help.

I have tried this, but the code is not working:

var pagetag = xdoc.Descendants("break").Descendants("h1")
.Where(br => br.ElementsBeforeSelf("h1") != new XElement("psf") ||                                                                 
br.ElementsBeforeSelf("h1") != new XElement("break"))
.Select(br => br.Attribute("name").Value.Trim())
.Aggregate((a, b) => a + ", " + b);

MessageBox.Show("The following articles have invalid tags before <h1>: " + pagetag);

The first problem is that ElementsBeforeSelf() returns a sequence of elements, but you're checking whether that sequence is equal to a single XElement, and comparing them by reference using != . A sequence is never the same object as a newly created XElement, so that condition is always true.
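As a minimal sketch of the difference, the helper below takes the nearest preceding sibling and compares its name rather than comparing objects. The class and method names (SiblingCheck, IsImmediatelyPrecededBy) are made up for illustration and are not part of LINQ to XML.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class SiblingCheck
{
    // Hypothetical helper: true when the sibling element immediately
    // before `element` has the given name.
    public static bool IsImmediatelyPrecededBy(XElement element, XName name)
    {
        // ElementsBeforeSelf() yields all preceding sibling elements in
        // document order, so the last one is the nearest.
        XElement previous = element.ElementsBeforeSelf().LastOrDefault();

        // XName defines ==, so this compares element names rather than
        // XElement object identity.
        return previous != null && previous.Name == name;
    }
}
```

For the first <sec> in the question, calling IsImmediatelyPrecededBy(h1, "psf") on its <h1> would return false, because the element directly before that <h1> is <pps>.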

You're also asking for the descendants of break elements, and there aren't any: every <break> in your document is empty. I think you just want all the h1 elements.

To clarify your requirement, I think you're trying to find all the h1 elements where the last sibling element before the h1 is neither break nor psf. For each of those elements, you want to find the most recent break element before the h1 (if there is one) and report its name attribute.

Assuming that's the case, here's some code which I believe does what you want, with comments explaining it:

using System;
using System.Linq;
using System.Xml.Linq;

public class Test
{
    public static void Main()
    {
        var xdoc = XDocument.Load("test.xml");
        XName brName = "break";
        XName psfName = "psf";

        var invalidNames = 
            from h1 in xdoc.Descendants("h1")
            // Find the last sibling element before the h1
            let previous = h1.ElementsBeforeSelf().LastOrDefault()
            // It's invalid if there isn't a previous element, or it has
            // a name other than break or psf
            where previous?.Name != brName && previous?.Name != psfName
            // Get the name to report, handling the case where there's
            // no previous break or no "name" attribute
            select ((string) h1.ElementsBeforeSelf(brName).LastOrDefault()?.Attribute("name")) ?? "(no named break)";

        Console.WriteLine(string.Join(", ", invalidNames));
    }
}

It has a bit of a flaw: if an <h1> is invalid but has no immediate <break> predecessor, it will look back as far as the nearest earlier one to find a name. So if you remove the <break name="5-1"/> element, for example, it'll report the name "4-1" as being invalid, as that's the last break element before the h1 that came after 5-1. I don't know how important that is to you.
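If that matters, one possible variation (a sketch only, not part of the code above) is to stop looking back at the previous <h1>, so that a <break> belonging to an earlier heading is never reported. The class and method names here (BreakValidator, FindInvalid) are hypothetical, and the rule for what counts as invalid is assumed to be the same as in the code above.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class BreakValidator
{
    // Hypothetical helper: returns one entry per invalid h1.
    public static List<string> FindInvalid(XDocument xdoc)
    {
        XName brName = "break";
        XName psfName = "psf";
        XName h1Name = "h1";
        var invalid = new List<string>();

        foreach (var h1 in xdoc.Descendants(h1Name))
        {
            // Preceding siblings, nearest first, cut off at the previous h1
            // so that an earlier heading's break is never picked up.
            var run = h1.ElementsBeforeSelf()
                        .Reverse()
                        .TakeWhile(e => e.Name != h1Name)
                        .ToList();

            // Same rule as before: the nearest element must be break or psf.
            var previous = run.FirstOrDefault();
            if (previous != null &&
                (previous.Name == brName || previous.Name == psfName))
            {
                continue;
            }

            // Report the break that belongs to this h1, if there is one.
            var owner = run.FirstOrDefault(e => e.Name == brName);
            invalid.Add(((string) owner?.Attribute("name")) ?? "(no named break)");
        }

        return invalid;
    }
}
```

With the <break name="5-1"/> element removed from the sample document, this version would report "(no named break)" for the fifth heading instead of "4-1".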
