Parse XML with text and XML tags in the same XML tag

Question

I want to parse an XML document with Java that looks something like this:

<sentence>This is a <a><b>long</b></a> sentence.</sentence>
<sentence>This is a second <a><b>even</b></a> longer sentence.</sentence>

As a result I need the whole sentence without the XML markup. I tried to parse this with dom4j. Calling element.getText() (where the current element is the sentence tag) returns only the sentence's own text, without the text inside the nested XML tags.

Thanks for your help! Regards

Answer 1

Keep the data in a CDATA section inside the XML tag:

<sentence><![CDATA[This is a <a><b>long</b></a> sentence.]]></sentence>
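
To illustrate what this approach implies, here is a minimal sketch that uses the JDK's built-in DOM parser rather than dom4j (the class name CdataDemo and the method name are made up for the example). Inside a CDATA section the nested tags are not parsed as elements, so they come back as literal characters in the text:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original answer.
class CdataDemo {
    // Parses an XML string and returns the text content of its root element.
    static String rootText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<sentence><![CDATA[This is a <a><b>long</b></a> sentence.]]></sentence>";
        // The <a> and <b> tags are plain characters here, not child elements.
        System.out.println(rootText(xml));
    }
}
```

So this only helps if keeping the inner markup as part of the text is acceptable; it does not strip the tags from the sentence.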

Answer 2

You can use XPath to select all the text nodes below the element and concatenate them:

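import java.util.List;
import org.dom4j.Node;

// Concatenates every text node at or below the given node, in document order.
String getAllTextContent(Node node) {
  // descendant-or-self::text() matches text nodes at any nesting depth
  List<Node> nodes = node.selectNodes("descendant-or-self::text()");
  StringBuilder buf = new StringBuilder();
  for (Node n : nodes) {
    buf.append(n.getText());
  }
  return buf.toString();
}

// usage
System.out.println(getAllTextContent(doc.selectSingleNode("//sentence")));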
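
For comparison, the same idea can be sketched without dom4j. This is only an illustration using the JDK's built-in DOM parser, where getTextContent() on an element returns its own text together with the text of all nested elements. The class name SentenceText and the method name allText are made up for the example:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original answer.
class SentenceText {
    // Parses a single-rooted XML string and returns the text of the root
    // element concatenated with the text of all its descendants.
    static String allText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<sentence>This is a <a><b>long</b></a> sentence.</sentence>";
        System.out.println(allText(xml)); // prints: This is a long sentence.
    }
}
```

Note that the two sentence elements from the question would need a common root element before they can be parsed as one document.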

2 answers

solution1
0 2013-04-19 10:43:55

solution2
0 ACCEPTED 2013-04-19 11:46:35
