
Extract XML element as string including attribute namespace using StAX

Given the following XML string:

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:a="http://a" xmlns:b="http://b">
  <a:element b:attribute="value">
    <subelement/>
  </a:element>
</root>

I'd like to extract the element a:element as an XML string using StAX, preserving the namespaces it uses. So I would expect:

<?xml version="1.0" encoding="UTF-8"?>
<a:element xmlns:a="http://a" xmlns:b="http://b" b:attribute="value">
  <subelement/>
</a:element>

Following answers like https://stackoverflow.com/a/5170415/2391901 and https://stackoverflow.com/a/4353531/2391901, I already have the following code:

// Parse the input with StAX and move the cursor to the element to extract.
final ByteArrayInputStream inputStream = new ByteArrayInputStream(inputString.getBytes(StandardCharsets.UTF_8));
final XMLInputFactory xmlInputFactory = XMLInputFactory.newFactory();
final XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(inputStream);
xmlStreamReader.nextTag(); // cursor is now on <root>
xmlStreamReader.nextTag(); // cursor is now on <a:element>
// Serialize from the current cursor position with an identity transformer.
final TransformerFactory transformerFactory = TransformerFactory.newInstance();
final Transformer transformer = transformerFactory.newTransformer();
final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
transformer.transform(new StAXSource(xmlStreamReader), new StreamResult(outputStream));
final String outputString = outputStream.toString(StandardCharsets.UTF_8.name());

However, the result does not contain the namespace http://b of the attribute b:attribute (using either the default StAX parser of Java 8 or the StAX parser of Aalto XML):

<?xml version="1.0" encoding="UTF-8"?>
<a:element xmlns:a="http://a" b:attribute="value">
  <subelement/>
</a:element>

How do I get the expected result using StAX?
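A quick check suggests where the declaration goes: xmlns:b is written on <root>, not on a:element, so it is in scope at a:element without being declared there. The sketch below assumes a namespace-aware parser (the StAX default); the class name NamespaceScopeProbe and the helper probe are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, named for illustration only.
final class NamespaceScopeProbe {

    static final String INPUT =
        "<root xmlns:a=\"http://a\" xmlns:b=\"http://b\">"
        + "<a:element b:attribute=\"value\"><subelement/></a:element>"
        + "</root>";

    /** Positions a reader on the second start tag and reports what StAX exposes there. */
    static String probe(String xml) throws XMLStreamException {
        XMLStreamReader reader =
            XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
        reader.nextTag(); // <root>
        reader.nextTag(); // <a:element>
        // Number of namespace declarations written on this start tag itself.
        int declaredHere = reader.getNamespaceCount();
        // URI bound to prefix "b" at this point, including inherited bindings.
        String uriOfB = reader.getNamespaceURI("b");
        reader.close();
        return "declaredHere=" + declaredHere + ", b=" + uriOfB;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(probe(INPUT));
    }
}
```

If the reader only hands the declarations of the current start tag to the transformer, an inherited binding such as b would be dropped, which would be consistent with the output above. That is an inference from the observed result, not documented behaviour.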

It would be cleaner to use an XSLT transform to do this. You're already using an identity transformer to perform output, so just set it up to copy the target element instead of everything:

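import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public static void main(String[] args) throws TransformerException {

    String inputString =
        "<root xmlns:a='http://a' xmlns:b='http://b'>" +
        "  <a:element b:attribute='value'>" +
        "    <subelement/>" +
        "  </a:element>" +
        "</root>";

    // xsl:copy-of copies the element together with its in-scope namespace nodes.
    String xslt = 
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xmlns:a='http://a'>" +
        "    <xsl:template match='/root'>" +
        "        <xsl:copy-of select='a:element'/>" +
        "    </xsl:template>" +
        "</xsl:stylesheet>";

    // Compile the stylesheet, then run it over the input and print the result.
    TransformerFactory transformerFactory = TransformerFactory.newInstance();
    Transformer transformer = transformerFactory.newTransformer(new StreamSource(new StringReader(xslt)));
    transformer.transform(new StreamSource(new StringReader(inputString)), new StreamResult(System.out));
}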
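Since the question asks for a string rather than console output, the same stylesheet can write into a StringWriter instead. This is a minimal sketch assuming the transformer bundled with the JDK; the class name CopyOfToString and the method extract are illustrative names, not part of any API.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical wrapper, named for illustration only.
final class CopyOfToString {

    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xmlns:a='http://a'>"
        + "<xsl:template match='/root'>"
        + "<xsl:copy-of select='a:element'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Runs the stylesheet over the input and returns the serialized result. */
    static String extract(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        // Collect the output in memory instead of printing it.
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String input =
            "<root xmlns:a='http://a' xmlns:b='http://b'>"
            + "<a:element b:attribute='value'><subelement/></a:element>"
            + "</root>";
        System.out.println(extract(input));
    }
}
```

Using a StringWriter also avoids the byte-to-string decoding step that the ByteArrayOutputStream version in the question needs.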

The StAX subtree transform that you're using relies on some iffy behaviour of the transformer that ships with the JDK. It didn't work when I tried it with the Saxon transformer (which complained about the trailing </root>).
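If the extraction has to stay within StAX, one possible approach is to skip the transformer and copy events by hand, re-declaring on the extracted element every namespace that its ancestors had in scope. The following is a sketch under that assumption, not a tested recipe for every parser; SubtreeExtractor, extract and copySubtree are hypothetical names, and the code re-declares all inherited bindings whether or not the subtree uses them.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Namespace;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper, named for illustration only.
final class SubtreeExtractor {

    /** Returns the first element named {@code target} as a string, or null if absent. */
    static String extract(String xml, QName target) throws XMLStreamException {
        XMLEventReader reader =
            XMLInputFactory.newFactory().createXMLEventReader(new StringReader(xml));
        // One entry per open element: the namespaces declared on its start tag.
        Deque<List<Namespace>> scopes = new ArrayDeque<>();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isEndElement()) {
                scopes.pop();
            } else if (event.isStartElement()) {
                StartElement start = event.asStartElement();
                List<Namespace> declared = new ArrayList<>();
                Iterator<?> it = start.getNamespaces();
                while (it.hasNext()) {
                    declared.add((Namespace) it.next());
                }
                scopes.push(declared);
                if (start.getName().equals(target)) {
                    return copySubtree(reader, start, scopes);
                }
            }
        }
        return null;
    }

    private static String copySubtree(XMLEventReader reader, StartElement start,
                                      Deque<List<Namespace>> scopes) throws XMLStreamException {
        // Merge outermost-first so that an inner redeclaration of a prefix wins.
        Map<String, Namespace> inScope = new LinkedHashMap<>();
        Iterator<List<Namespace>> outerToInner = scopes.descendingIterator();
        while (outerToInner.hasNext()) {
            for (Namespace ns : outerToInner.next()) {
                inScope.put(ns.getPrefix(), ns);
            }
        }
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newFactory().createXMLEventWriter(out);
        // Replace the original start tag with one that carries all merged declarations.
        writer.add(XMLEventFactory.newFactory().createStartElement(
            start.getName(), start.getAttributes(), inScope.values().iterator()));
        // Copy events until the end tag that matches the extracted start tag.
        int depth = 1;
        while (depth > 0) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
            } else if (event.isEndElement()) {
                depth--;
            }
            writer.add(event);
        }
        writer.flush();
        writer.close();
        return out.toString();
    }
}
```

For the input in the question this would be called as SubtreeExtractor.extract(inputString, new QName("http://a", "element")). Because no start-document event is written, the result carries no XML declaration, and the exact serialization of empty elements depends on the StAX implementation in use.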
