Get the node value of the first node

Question

I have the following XML:

<?xml version='1.0' ?>
<foo>A&gt;B</foo>

and I just want to get the node value of the foo element as A&gt;B. If we use getNodeValue, it converts it to A>B, which is not what I need.

Hence I decided to use the Transformer:

        Document doc = getParsedDoc(abovexml);
        TransformerFactory tranFact = TransformerFactory.newInstance();
        Transformer transfor = tranFact.newTransformer();
        transfor.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        Source src = new DOMSource(doc.getDocumentElement()); // serialize the root <foo> element
        StringWriter buffer = new StringWriter();
        Result dest = new StreamResult(buffer);
        transfor.transform(src, dest);
        String result = buffer.toString();

But this gives the following output in result: <foo>A&gt;B</foo>

It would be helpful if somebody could clarify whether there is an approach that gets A&gt;B without doing string manipulation on the above output (<foo>A&gt;B</foo>).

Answer 1

Since getNodeValue() automatically decodes the string, you can use StringEscapeUtils from Apache Commons Lang to encode it again.

http://commons.apache.org/lang/api-2.6/org/apache/commons/lang/StringEscapeUtils.html
http://commons.apache.org/lang/

String nodeValue = StringEscapeUtils.escapeHtml(getNodeValue());

That encodes it into the format you want. It is not very performance-friendly, because you apply the encoding to every node value.
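If adding Commons Lang is not an option, the same idea can be sketched with the JDK alone. The class and method names below (EscapedNodeValue, escapeXmlText, escapedTextOfRoot) are hypothetical, and the hand-written escaper is only a stand-in for StringEscapeUtils that covers &, < and >. It reads the text via getTextContent() and is an illustration of the approach, not a drop-in replacement.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EscapedNodeValue {

    // Hypothetical helper: re-escapes the three XML markup characters by hand.
    // '&' is handled like the others because each input char is visited once.
    static String escapeXmlText(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Parses the XML, reads the decoded text of the root element,
    // then encodes it again.
    static String escapedTextOfRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        String decoded = doc.getDocumentElement().getTextContent(); // "A>B"
        return escapeXmlText(decoded);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version='1.0' ?><foo>A&gt;B</foo>";
        System.out.println(escapedTextOfRoot(xml)); // prints A&gt;B
    }
}
```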

Answer 2

Actually, getNodeValue() is not "converting" the string. When the XML is parsed from a file, or produced by a transformation, the resulting information model holds the string A>B, not A&gt;B. The latter is just a serialization form.

Another legitimate serialization form is A>B (because the right angle bracket does not need to be escaped in most cases). However, there may be compatibility reasons for wanting to produce A&gt;B, especially if your output is intended to be HTML (though you didn't mention that).
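To illustrate the point about serialization forms, here is a small sketch that parses both spellings and compares the text the DOM holds. The class and method names (SerializationForms, rootText) are hypothetical, and it reads the text via getTextContent().

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class SerializationForms {

    // Parses the given XML string and returns the text of the root element
    // as the DOM sees it, that is, after entity references are resolved.
    static String rootText(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String escaped = rootText("<foo>A&gt;B</foo>"); // escaped form
        String literal = rootText("<foo>A>B</foo>");    // unescaped form

        System.out.println(escaped);                 // prints A>B
        System.out.println(escaped.equals(literal)); // prints true
    }
}
```

Both inputs produce the same string in memory, so the escaped form only exists again once something serializes or re-encodes the value.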

If you have a good reason for escaping the >, then I agree with @kensen john's answer for getting that done.

Answer 1 (0, accepted, 2011-06-03 18:46:00)

Answer 2 (0, 2012-02-15 16:10:32)
