
How to parse nested xml tags with the same tag name

I have an unspecified number of nested categories, which contain items:

<categories>
    <category>abc
        <category>cde
            <item>someid</item>
            <item>someid</item>
            <item>someid</item>
            <item>someid</item>
        </category>
    </category>
    <category>xyz
        <category>zwd
            <category>hgw
                <item>someid</item>
...

The result should be a list of the items in the most deeply nested category (cde or hgw). The tricky part is that there can be more than two levels of category nesting, and I want to keep every parent category for each child category.

I have already done some xml parsing with Jackson's XmlMapper and ObjectMapper, but this use case seems out of reach. I then tried the javax xml parser but gave up, because the code looked horrible and was hardly readable.

Any idea how to solve this in a more elegant way?

If the task is to quickly pull some values from the xml, then I would use jsoup. Jsoup is actually an html parser, but it is also able to parse xml. I'm not sure whether jsoup can also validate against an xml schema, handle namespaces and so on, which is possible with other parsers. But for reading a few values jsoup is usually enough for me. The jsoup cookbook and the selector syntax reference are worth a look.

Maven:

<!-- https://mvnrepository.com/artifact/org.jsoup/jsoup -->
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.13.1</version>
</dependency>

Using jsoup, your code could look something like this:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;

public class Example {


    public static void main(String[] args) {
        String xml = "<categories>\n"
                + "    <category>abc\n"
                + "        <category>cde\n"
                + "            <item>someid_1</item>\n"
                + "            <item>someid_2</item>\n"
                + "            <item>someid_3</item>\n"
                + "            <item>someid_4</item>\n"
                + "        </category>\n"
                + "    </category>\n"
                + "    <category>xyz\n"
                + "       <category>zwd\n"
                + "          <category>hgw\n"
                + "             <item>someid_5</item>\n"
                + "          </category>\n"
                + "       </category>\n"
                + "    </category>\n"
                + " </categories>";

        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());

        // If you are interested in the items only
        Elements items = doc.select("category > item");
        items.forEach(i -> {
            System.out.println("Parent text: " + i.parent().ownText());
            System.out.println("Item text: " + i.text());
            System.out.println();
        });

        // If you are interested in categories having at least one direct item element
        Elements categories = doc.select("category:has(> item)");
        categories.forEach(c -> {
            System.out.println(c.ownText());
            Elements children = c.children();
            children.forEach(ch -> {
                System.out.println(ch.text());
            });
            System.out.println();
        });
    }

}

Output:

Parent text: cde
Item text: someid_1

Parent text: cde
Item text: someid_2

Parent text: cde
Item text: someid_3

Parent text: cde
Item text: someid_4

Parent text: hgw
Item text: someid_5

cde
someid_1
someid_2
someid_3
someid_4

hgw
someid_5
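
The question also asks to keep every parent category for each child category, which the jsoup example above does not print. As a rough sketch of one way to do that, the following uses only the JDK's built-in DOM parser instead of jsoup, so it runs without extra dependencies. The class name CategoryPaths, the method itemsByCategoryPath, the SAMPLE_XML constant and the "/" path separator are all made up for illustration. It assumes the category name is the text sitting directly inside the category tag, as in the sample, and it collects every category that has direct item children, which in the sample are exactly the innermost ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class CategoryPaths {

    // Hypothetical sample data, same shape as the xml in the question.
    public static final String SAMPLE_XML =
            "<categories>"
            + "<category>abc"
            + "  <category>cde"
            + "    <item>someid_1</item><item>someid_2</item>"
            + "    <item>someid_3</item><item>someid_4</item>"
            + "  </category>"
            + "</category>"
            + "<category>xyz"
            + "  <category>zwd"
            + "    <category>hgw<item>someid_5</item></category>"
            + "  </category>"
            + "</category>"
            + "</categories>";

    // Maps a category path such as "abc/cde" to the items found directly in that category.
    public static Map<String, List<String>> itemsByCategoryPath(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Map<String, List<String>> result = new LinkedHashMap<>();
        walk(doc.getDocumentElement(), "", result);
        return result;
    }

    // Recursively visits child elements, extending the path at every nested category.
    private static void walk(Element element, String path, Map<String, List<String>> result) {
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (!(child instanceof Element)) {
                continue; // skip text and comment nodes
            }
            Element e = (Element) child;
            if (e.getTagName().equals("category")) {
                String name = ownText(e);
                walk(e, path.isEmpty() ? name : path + "/" + name, result);
            } else if (e.getTagName().equals("item")) {
                result.computeIfAbsent(path, k -> new ArrayList<>()).add(e.getTextContent().trim());
            }
        }
    }

    // Text that belongs to the element itself, ignoring text of nested elements.
    private static String ownText(Element e) {
        StringBuilder sb = new StringBuilder();
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE) {
                sb.append(n.getNodeValue());
            }
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        itemsByCategoryPath(SAMPLE_XML)
                .forEach((path, items) -> System.out.println(path + " -> " + items));
    }
}
```

Passing the path down as a parameter of the recursive call means no explicit stack of parents is needed, and a LinkedHashMap keeps the categories in document order. If you stay with jsoup, walking up from each item through its ancestors should give the same information.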
