How to get the XML element path using StAX / StAX2?

Question

I want to get the element path while parsing XML with the Java StAX2 parser. How can I get information about the current element's path?

<root>
  <a><b>x</b></a>
</root>

In this example, the path of the innermost element is /root/a/b.

Answer 1

Keep a stack. Push the element name on START_ELEMENT and pop it on END_ELEMENT.

Here's a short example. It does nothing other than print the path of each element as it is read.

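import java.io.FileInputStream;
import java.io.IOException;
import java.util.LinkedList;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// The class name is arbitrary; only main() comes from the answer.
public class PrintElementPaths {
    public static void main(String[] args) throws IOException, XMLStreamException {
        try (FileInputStream in = new FileInputStream("test.xml")) {

            XMLInputFactory factory = XMLInputFactory.newFactory();
            XMLStreamReader reader = factory.createXMLStreamReader(in);

            // names of all currently open elements, outermost first
            LinkedList<String> path = new LinkedList<>();

            int next;
            while ((next = reader.next()) != XMLStreamConstants.END_DOCUMENT) {
                switch (next) {
                    case XMLStreamConstants.START_ELEMENT:
                        // push the name of the current element onto the stack
                        path.addLast(reader.getLocalName());
                        // print the path with '/' delimiters
                        System.out.println("Reading /" + String.join("/", path));
                        break;

                    case XMLStreamConstants.END_ELEMENT:
                        // pop the name of the element being closed
                        path.removeLast();
                        break;

                    default:
                        // other events (text, comments, ...) do not change the path
                        break;
                }
            }
        }
    }
}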
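
To try the stack technique without a file on disk, it can be wrapped in a small helper that parses an in-memory string and returns every element path it sees. This is only a sketch under assumptions: the class name `ElementPaths` and the method `collectPaths` are hypothetical and not part of the answer, and it relies solely on the standard `javax.xml.stream` API.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, named for illustration only.
public class ElementPaths {

    // Returns the path of every element, in document order.
    public static List<String> collectPaths(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        Deque<String> stack = new ArrayDeque<>(); // open elements, outermost first
        List<String> paths = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    // entering an element: push its name and record the path
                    stack.addLast(reader.getLocalName());
                    paths.add("/" + String.join("/", stack));
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    // leaving an element: pop its name
                    stack.removeLast();
                }
            }
        } finally {
            reader.close();
        }
        return paths;
    }

    public static void main(String[] args) throws XMLStreamException {
        // prints /root, /root/a and /root/a/b, one per line
        for (String path : collectPaths("<root><a><b>x</b></a></root>")) {
            System.out.println(path);
        }
    }
}
```

An `ArrayDeque` is used here instead of `LinkedList`; either works, since `String.join` only needs to iterate from the outermost element to the innermost.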

Answer 2

"The chronicler's duty"

Method 1: a dedicated stack, as @teppic suggested

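// Fragment: `xml` (the document as a String) and `processPath` are supplied by the caller.
// Needs org.codehaus.stax2.XMLInputFactory2 and org.codehaus.stax2.XMLStreamReader2;
// the casts only succeed when a StAX2 implementation such as Woodstox is on the classpath.
// UTF-8 is assumed here; the original used the platform default charset.
try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
    final XMLInputFactory2 factory = (XMLInputFactory2) XMLInputFactory.newInstance();
    final XMLStreamReader2 reader = (XMLStreamReader2) factory.createXMLStreamReader(in);
    Stack<String> pathStack = new Stack<>();
    while (reader.hasNext()) {
        reader.next();
        if (reader.isStartElement()) {
            // entering an element: push its name and hand the full path to the caller
            pathStack.push(reader.getLocalName());
            processPath('/' + String.join("/", pathStack));
        } else if (reader.isEndElement()) {
            // leaving an element: pop its name
            pathStack.pop();
        }
    }
}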
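
The answer does not say what `processPath` does. One plausible use of the path is to pick out the text of elements at a given location, which can be sketched as follows. Everything named here is an assumption for illustration: the class `PathText`, the method `textAt`, and the choice of the standard `javax.xml.stream` API instead of the StAX2 types. The sketch also assumes the target elements contain only text, not child elements.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, named for illustration only.
public class PathText {

    // Returns the text of every element whose path equals targetPath.
    // Assumes those elements hold text only (no nested elements).
    public static List<String> textAt(String xml, String targetPath) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
        Deque<String> stack = new ArrayDeque<>();
        List<String> values = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        stack.addLast(reader.getLocalName());
                        text.setLength(0); // start collecting text for this element
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        // text may arrive in several chunks, so append rather than assign
                        text.append(reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        // compare the path of the element being closed with the target
                        if (targetPath.equals("/" + String.join("/", stack))) {
                            values.add(text.toString());
                        }
                        stack.removeLast();
                        text.setLength(0);
                        break;
                    default:
                        break;
                }
            }
        } finally {
            reader.close();
        }
        return values;
    }
}
```

For example, `PathText.textAt("<root><a><b>x</b><b>y</b></a></root>", "/root/a/b")` returns a list containing `x` and `y`.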

Method 2 (ugly): hacking Woodstox's `InputElementStack`

Implement an adapter that accesses InputElementStack, reads its protected mCurrElement field, and iterates over the parent elements (this slows the algorithm down).

// Declared in Woodstox's own package so that the protected members are accessible.
package com.ctc.wstx.sr;

import java.util.LinkedList;

public class StackUglyAdapter {
    public static String PATH_SEPARATOR = "/";

    private InputElementStack stack;

    public StackUglyAdapter(InputElementStack stack) {
        this.stack = stack;
    }

    public String getCurrElementLocalName() {
        return this.stack.mCurrElement.mLocalName;
    }

    public String getCurrElementPath() {
        LinkedList<String> list = new LinkedList<String>();
        // walk from the current element up to the root, prepending each name
        Element el = this.stack.mCurrElement;
        while (el != null) {
            list.addFirst(el.mLocalName);
            el = el.mParent;
        }
        return PATH_SEPARATOR + String.join(PATH_SEPARATOR, list);
    }
}

Example of use:

try (final InputStream in = new ByteArrayInputStream(xml.getBytes())) {
    final XMLInputFactory2 factory = (XMLInputFactory2) XMLInputFactory.newInstance();
    final XMLStreamReader2 reader = (XMLStreamReader2) factory.createXMLStreamReader(in);
    final StackUglyAdapter stackAdapter =
            new StackUglyAdapter(((StreamReaderImpl) reader).getInputElementStack());
    while (reader.hasNext()) {
        reader.next();
        if (reader.isStartElement()) {
            processPath(stackAdapter.getCurrElementPath());
        }
    }
}

Method 1, with the dedicated stack, is better: it does not depend on the StAX implementation and is just as fast as Method 2.


Answer 1: score 2, accepted, 2016-12-13 09:20:15

Answer 2: score 1, 2016-12-15 01:23:59
