Extract XML element as string including attribute namespace using StAX
Given the following XML string:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:a="http://a" xmlns:b="http://b">
<a:element b:attribute="value">
<subelement/>
</a:element>
</root>
I'd like to use StAX to extract the element a:element as an XML string while preserving the namespaces it uses. So I would expect:
<?xml version="1.0" encoding="UTF-8"?>
<a:element xmlns:a="http://a" xmlns:b="http://b" b:attribute="value">
<subelement/>
</a:element>
Following answers like https://stackoverflow.com/a/5170415/2391901 and https://stackoverflow.com/a/4353531/2391901, I already have the following code:
final ByteArrayInputStream inputStream = new ByteArrayInputStream(inputString.getBytes(StandardCharsets.UTF_8));
final XMLInputFactory xmlInputFactory = XMLInputFactory.newFactory();
final XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(inputStream);
// Advance to <root>, then to <a:element>.
xmlStreamReader.nextTag();
xmlStreamReader.nextTag();
final TransformerFactory transformerFactory = TransformerFactory.newInstance();
final Transformer transformer = transformerFactory.newTransformer();
final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
transformer.transform(new StAXSource(xmlStreamReader), new StreamResult(outputStream));
final String outputString = outputStream.toString(StandardCharsets.UTF_8.name());
However, the result does not contain the namespace http://b of the attribute b:attribute (using either the default StAX parser of Java 8 or the StAX parser of Aalto XML):
<?xml version="1.0" encoding="UTF-8"?>
<a:element xmlns:a="http://a" b:attribute="value">
<subelement/>
</a:element>
How do I get the expected result using StAX?
It would be cleaner to use an XSLT transform to do this. You're already using an identity transformer to perform output; just set it up to copy the target element instead of everything:
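import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ExtractElement {
    public static void main(String[] args) throws TransformerException {
        String inputString =
            "<root xmlns:a='http://a' xmlns:b='http://b'>" +
            "  <a:element b:attribute='value'>" +
            "    <subelement/>" +
            "  </a:element>" +
            "</root>";
        // xsl:copy-of copies the element together with its in-scope namespaces,
        // so the declaration for prefix b is carried along.
        String xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xmlns:a='http://a'>" +
            "  <xsl:template match='/root'>" +
            "    <xsl:copy-of select='a:element'/>" +
            "  </xsl:template>" +
            "</xsl:stylesheet>";
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer(new StreamSource(new StringReader(xslt)));
        transformer.transform(new StreamSource(new StringReader(inputString)), new StreamResult(System.out));
    }
}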
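Since the question asks for the element as a string rather than console output, here is a minimal sketch of the same stylesheet approach that writes into a StringWriter. The class name XsltExtractToString and the method extract are hypothetical names chosen for illustration; the stylesheet is the one shown above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltExtractToString {

    // Same stylesheet as above: copy /root/a:element including its in-scope namespaces.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xmlns:a='http://a'>" +
        "  <xsl:template match='/root'>" +
        "    <xsl:copy-of select='a:element'/>" +
        "  </xsl:template>" +
        "</xsl:stylesheet>";

    /** Hypothetical helper: returns the serialized a:element of the given document. */
    public static String extract(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String input =
            "<root xmlns:a='http://a' xmlns:b='http://b'>" +
            "  <a:element b:attribute='value'>" +
            "    <subelement/>" +
            "  </a:element>" +
            "</root>";
        System.out.println(extract(input));
    }
}
```

Writing to a character-based result sidesteps the byte-to-string decoding step in the question's code; the exact serialization (attribute order, line breaks) may differ between transformer implementations.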
The StAX subtree transform that you're using relies on some iffy behaviour of the transformer that ships with the JDK. It didn't work when I tried it with the Saxon transformer (which complained about the trailing </root>).
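If the extraction has to stay within StAX, one possible approach (not part of the answer above, only a sketch) is to copy the subtree by hand from an XMLStreamReader to an XMLStreamWriter and to declare every prefix used by an element or attribute name that is not yet bound in the output. The class StaxSubtreeCopy and its methods are hypothetical names. The sketch assumes the reader is positioned on the START_ELEMENT to extract, skips comments and processing instructions, and does not write an XML declaration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

public class StaxSubtreeCopy {

    private static String orEmpty(String s) {
        return s == null ? "" : s;
    }

    /** Writes a namespace declaration unless the output already has this binding in scope. */
    private static void declare(XMLStreamWriter writer, Map<String, String> scope,
                                String prefix, String uri) throws XMLStreamException {
        if ("xml".equals(prefix) || uri.equals(scope.getOrDefault(prefix, ""))) {
            return; // the xml prefix is implicit; other bindings may already be in scope
        }
        scope.put(prefix, uri);
        if (prefix.isEmpty()) {
            writer.writeDefaultNamespace(uri);
        } else {
            writer.writeNamespace(prefix, uri);
        }
    }

    /** Copies the element the reader is currently positioned on (START_ELEMENT) to a string. */
    public static String copyCurrentElement(XMLStreamReader reader) throws XMLStreamException {
        if (reader.getEventType() != XMLStreamConstants.START_ELEMENT) {
            throw new IllegalStateException("reader must be positioned on START_ELEMENT");
        }
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newFactory().createXMLStreamWriter(out);
        // One map per open output element: prefix -> URI bindings visible at that depth.
        Deque<Map<String, String>> scopes = new ArrayDeque<>();
        int event = reader.getEventType();
        while (true) {
            if (event == XMLStreamConstants.START_ELEMENT) {
                Map<String, String> scope = scopes.isEmpty()
                    ? new HashMap<String, String>() : new HashMap<>(scopes.peek());
                scopes.push(scope);
                String prefix = orEmpty(reader.getPrefix());
                String uri = orEmpty(reader.getNamespaceURI());
                writer.writeStartElement(prefix, reader.getLocalName(), uri);
                // Declarations written on this element in the source document.
                for (int i = 0; i < reader.getNamespaceCount(); i++) {
                    declare(writer, scope, orEmpty(reader.getNamespacePrefix(i)),
                            orEmpty(reader.getNamespaceURI(i)));
                }
                // Bindings inherited from ancestors that are not part of the output.
                declare(writer, scope, prefix, uri);
                for (int i = 0; i < reader.getAttributeCount(); i++) {
                    String attrPrefix = orEmpty(reader.getAttributePrefix(i));
                    if (!attrPrefix.isEmpty()) {
                        declare(writer, scope, attrPrefix, orEmpty(reader.getAttributeNamespace(i)));
                    }
                }
                for (int i = 0; i < reader.getAttributeCount(); i++) {
                    String attrPrefix = orEmpty(reader.getAttributePrefix(i));
                    if (attrPrefix.isEmpty()) {
                        writer.writeAttribute(reader.getAttributeLocalName(i), reader.getAttributeValue(i));
                    } else {
                        writer.writeAttribute(attrPrefix, orEmpty(reader.getAttributeNamespace(i)),
                                reader.getAttributeLocalName(i), reader.getAttributeValue(i));
                    }
                }
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                writer.writeEndElement();
                scopes.pop();
                if (scopes.isEmpty()) {
                    break; // closed the element we started on
                }
            } else if (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.SPACE
                    || event == XMLStreamConstants.CDATA) {
                writer.writeCharacters(reader.getText());
            }
            event = reader.next();
        }
        writer.flush();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String input = "<root xmlns:a='http://a' xmlns:b='http://b'>"
            + "<a:element b:attribute='value'><subelement/></a:element></root>";
        XMLStreamReader reader = XMLInputFactory.newFactory()
            .createXMLStreamReader(new StringReader(input));
        reader.nextTag(); // <root>
        reader.nextTag(); // <a:element>
        System.out.println(copyCurrentElement(reader));
    }
}
```

Unlike xsl:copy-of, this only declares namespaces that element or attribute names actually use, so a prefix that appears solely inside attribute values or text (for example a QName in content) would not be carried over.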