
Extract XML element as string including attribute namespace using StAX

Given the following XML string

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:a="http://a" xmlns:b="http://b">
  <a:element b:attribute="value">
    <subelement/>
  </a:element>
</root>

I'd like to extract the element a:element as an XML string while preserving the used namespaces using StAX. So I would expect

<?xml version="1.0" encoding="UTF-8"?>
<a:element xmlns:a="http://a" xmlns:b="http://b" b:attribute="value">
  <subelement/>
</a:element>

Following answers like https://stackoverflow.com/a/5170415/2391901 and https://stackoverflow.com/a/4353531/2391901 , I already have the following code:

final ByteArrayInputStream inputStream = new ByteArrayInputStream(inputString.getBytes(StandardCharsets.UTF_8));
final XMLInputFactory xmlInputFactory = XMLInputFactory.newFactory();
final XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(inputStream);
// Advance to <root>, then to <a:element>
xmlStreamReader.nextTag();
xmlStreamReader.nextTag();
// Serialize the subtree at the reader's current position with an identity transformer
final TransformerFactory transformerFactory = TransformerFactory.newInstance();
final Transformer transformer = transformerFactory.newTransformer();
final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
transformer.transform(new StAXSource(xmlStreamReader), new StreamResult(outputStream));
final String outputString = outputStream.toString(StandardCharsets.UTF_8.name());

However, the result does not contain the namespace http://b of the attribute b:attribute (using either the default StAX parser of Java 8 or the StAX parser of Aalto XML):

<?xml version="1.0" encoding="UTF-8"?>
<a:element xmlns:a="http://a" b:attribute="value">
  <subelement/>
</a:element>

How do I get the expected result using StAX?

It would be cleaner to use an XSLT transform to do this. You're already using an identity transformer to produce the output, so just give it a stylesheet that copies the target element instead of everything:

import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CopyElement {

    public static void main(String[] args) throws TransformerException {

        String inputString =
            "<root xmlns:a='http://a' xmlns:b='http://b'>" +
            "  <a:element b:attribute='value'>" +
            "    <subelement/>" +
            "  </a:element>" +
            "</root>";

        // xsl:copy-of copies the element together with its in-scope namespaces
        String xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xmlns:a='http://a'>" +
            "    <xsl:template match='/root'>" +
            "        <xsl:copy-of select='a:element'/>" +
            "    </xsl:template>" +
            "</xsl:stylesheet>";

        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer(new StreamSource(new StringReader(xslt)));
        transformer.transform(new StreamSource(new StringReader(inputString)), new StreamResult(System.out));
    }
}
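
Since the question asks for the result as a string rather than on System.out, the same stylesheet can be pointed at a StringWriter. This is only a sketch: the class name XsltSubtree and the method extractElement are made up for illustration, and it assumes the target is still a:element directly under root.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltSubtree {

    // Hypothetical helper: same stylesheet as above, hard-coded to /root/a:element.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xmlns:a='http://a'>" +
        "    <xsl:template match='/root'>" +
        "        <xsl:copy-of select='a:element'/>" +
        "    </xsl:template>" +
        "</xsl:stylesheet>";

    static String extractElement(String inputString) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        // Collect the serialized result in memory instead of printing it
        StringWriter result = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(inputString)), new StreamResult(result));
        return result.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String inputString =
            "<root xmlns:a='http://a' xmlns:b='http://b'>" +
            "  <a:element b:attribute='value'>" +
            "    <subelement/>" +
            "  </a:element>" +
            "</root>";
        System.out.println(extractElement(inputString));
    }
}
```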

The StAX subtree transform that you're using relies on behaviour that is specific to the transformer that ships with the JDK. It didn't work when I tried it with the Saxon transformer (which complained about the trailing </root>).
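
If it has to be StAX only, one option is to skip the transformer and copy the events by hand, re-declaring on the extracted element every namespace that was declared on its ancestors. The following is a rough sketch of that idea, not a drop-in solution: the class SubtreeExtractor and its methods are invented for this example, it drops comments and processing instructions, and it emits no XML declaration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

final class SubtreeExtractor {

    /** Returns the first element named {nsUri}localName as a string, or null if there is none. */
    static String extract(String xml, String nsUri, String localName) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLStreamWriter w = XMLOutputFactory.newFactory().createXMLStreamWriter(out);
        Deque<Map<String, String>> scopes = new ArrayDeque<>(); // declarations of the open ancestors
        int depth = 0;                                          // 0 means "not inside the target yet"
        while (r.hasNext()) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                Map<String, String> declared = declarations(r);
                boolean isTarget = depth == 0
                    && localName.equals(r.getLocalName()) && nsUri.equals(r.getNamespaceURI());
                if (depth == 0 && !isTarget) {
                    scopes.push(declared);                      // remember it, keep searching
                    continue;
                }
                Map<String, String> toWrite = declared;
                if (isTarget) {
                    // Merge ancestor declarations, outermost first, so inner ones win
                    toWrite = new LinkedHashMap<>();
                    for (Iterator<Map<String, String>> it = scopes.descendingIterator(); it.hasNext();) {
                        toWrite.putAll(it.next());
                    }
                    toWrite.putAll(declared);
                }
                writeStart(r, w, toWrite);
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                if (depth == 0) {
                    scopes.pop();                               // a non-target element closed
                    continue;
                }
                w.writeEndElement();
                if (--depth == 0) {                             // target closed: done
                    w.flush();
                    return out.toString();
                }
            } else if (depth > 0 && (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA || event == XMLStreamConstants.SPACE)) {
                w.writeCharacters(r.getText());
            }
        }
        return null;
    }

    // Namespace declarations on the current start tag; "" stands for the default namespace.
    private static Map<String, String> declarations(XMLStreamReader r) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i < r.getNamespaceCount(); i++) {
            m.put(nz(r.getNamespacePrefix(i)), nz(r.getNamespaceURI(i)));
        }
        return m;
    }

    private static void writeStart(XMLStreamReader r, XMLStreamWriter w, Map<String, String> ns)
            throws XMLStreamException {
        String uri = r.getNamespaceURI();
        if (uri == null || uri.isEmpty()) w.writeStartElement(r.getLocalName());
        else w.writeStartElement(nz(r.getPrefix()), r.getLocalName(), uri);
        for (Map.Entry<String, String> e : ns.entrySet()) {
            if (e.getKey().isEmpty()) w.writeDefaultNamespace(e.getValue());
            else w.writeNamespace(e.getKey(), e.getValue());
        }
        for (int i = 0; i < r.getAttributeCount(); i++) {
            String aUri = r.getAttributeNamespace(i);
            if (aUri == null || aUri.isEmpty()) {
                w.writeAttribute(r.getAttributeLocalName(i), r.getAttributeValue(i));
            } else {
                w.writeAttribute(nz(r.getAttributePrefix(i)), aUri,
                    r.getAttributeLocalName(i), r.getAttributeValue(i));
            }
        }
    }

    private static String nz(String s) { return s == null ? "" : s; }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root xmlns:a='http://a' xmlns:b='http://b'>"
            + "<a:element b:attribute='value'><subelement/></a:element></root>";
        System.out.println(extract(xml, "http://a", "element"));
    }
}
```

This re-declares every ancestor namespace on the extracted element, whether or not it is actually used inside it. That is more than strictly necessary, but it matches the expected output in the question and avoids having to scan the subtree for used prefixes.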
