How to get node content of XML with Dom4j in Java

I have an XML file like:

<description>
  <text>blahblah</text>
  <code>code</code>
  <text>blah</text>
</description>

I've navigated to the description node, and I want to read its full content, including the <text> tags and so on.

I've tried getText(), but it returned an empty string.
I've tried getStringValue(), but it stripped out all the <text> tags.
I've tried asXML(); the result is close, but it includes the enclosing <description> tags, which I don't want.

Is there a method to get the XML content of an element?
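
For reference, the three calls described above can be reproduced with a small program. This is only a sketch: it assumes dom4j is on the classpath, uses the sample XML without whitespace between tags, and the class name Dom4jTextMethodsDemo and the helper parseRoot are made up for illustration.

```java
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

public class Dom4jTextMethodsDemo {

    // Hypothetical helper: parses a string and returns the root element.
    public static Element parseRoot(String xml) {
        try {
            return DocumentHelper.parseText(xml).getRootElement();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Element description = parseRoot(
                "<description><text>blahblah</text><code>code</code><text>blah</text></description>");

        // Only the element's own direct text; there is none here.
        System.out.println("getText():        [" + description.getText() + "]");
        // Text of all descendants, with the tags dropped.
        System.out.println("getStringValue(): [" + description.getStringValue() + "]");
        // Full serialization, including the enclosing <description> tags.
        System.out.println("asXML():          [" + description.asXML() + "]");
    }
}
```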

Something like this:

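import java.io.StringReader;
import java.util.Iterator;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

// The class name is arbitrary; it only makes the snippet compilable.
public class GetContentExample {

  public static void main(String[] args) throws DocumentException {
    String xml = "<description><text>blahblah</text><code>code</code><text>blah</text></description>";
    SAXReader reader = new SAXReader();
    Document doc = reader.read(new StringReader(xml));
    Element description = doc.getRootElement();
    String content = getContent(description);
    System.out.println(content);
  }

  // Concatenates the XML of each child element, leaving out the element's own tags.
  private static String getContent(Element element) {
    StringBuilder builder = new StringBuilder();
    for (Iterator<Element> i = element.elementIterator(); i.hasNext();) {
      Element e = i.next();
      builder.append(e.asXML());
    }
    return builder.toString();
  }
}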

Note that if the element itself has text content, this will return only the child elements, not the text.
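
If the element's own text matters too, one possible variation is to walk all child nodes rather than only child elements. The sketch below assumes dom4j is on the classpath and relies on Element.nodeIterator() and Node.asXML(); the names MixedContentExample, getInnerXml and innerXmlOf are invented for illustration. How special characters in text nodes are escaped is worth checking against your dom4j version.

```java
import java.util.Iterator;

import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;

public class MixedContentExample {

    // Serializes every child node (text, elements, comments, ...)
    // while leaving out the element's own start and end tags.
    public static String getInnerXml(Element element) {
        StringBuilder sb = new StringBuilder();
        for (Iterator<Node> it = element.nodeIterator(); it.hasNext();) {
            sb.append(it.next().asXML());
        }
        return sb.toString();
    }

    // Hypothetical convenience wrapper: parse a string and return the
    // inner XML of its root element.
    public static String innerXmlOf(String xml) {
        try {
            return getInnerXml(DocumentHelper.parseText(xml).getRootElement());
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(innerXmlOf(
                "<description>intro <text>blahblah</text> tail</description>"));
    }
}
```

For an element that contains only text, the loop sees a single text node, so no separate text-only branch is needed in this variant.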

Assuming that document is an instance of org.dom4j.Document, then:

String xPath = "description";
List<Node> nodes = document.selectNodes(xPath);
for (Node node : nodes) {
  // asXML() serializes the node including its own <description> tags
  System.out.println(node.asXML());
}

Just want to add to the accepted answer by qwerky:

To also handle text-only elements (i.e. elements that contain no nested XML):

public static String getContent(Element element) {
    // A text-only element has no child elements, so return its text directly.
    if (element.isTextOnly())
        return element.getText();
    // Otherwise concatenate the XML of each child element.
    StringBuilder sb = new StringBuilder();
    for (Iterator<Element> iterator = element.elementIterator(); iterator.hasNext();) {
        Element currElement = iterator.next();
        sb.append(currElement.asXML());
    }
    return sb.toString();
}

So I added the following at the start of the method:

if (element.isTextOnly())
    return element.getText();

You should take a look at using XPath: http://www.ibm.com/developerworks/library/x-javaxpathapi/index.html
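
As a rough illustration of that route, here is a sketch that uses only the JDK (javax.xml.xpath plus a Transformer for serialization) instead of Dom4j. The class and method names are made up, the XPath expression /description/* is an assumption based on the sample XML in the question, and it selects child elements only, so any direct text of description is skipped.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathInnerXml {

    // Selects the child elements of <description> with XPath and
    // serializes each one, so the <description> wrapper is left out.
    public static String childElementsAsXml(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList children = (NodeList) xpath.evaluate(
                    "/description/*",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);

            // Identity transformer used purely as a serializer.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            for (int i = 0; i < children.getLength(); i++) {
                serializer.transform(new DOMSource(children.item(i)), new StreamResult(out));
            }
            return out.toString();
        } catch (Exception e) {
            // Wrapped so callers do not have to handle the checked exceptions.
            throw new IllegalArgumentException("Could not process XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<description><text>blahblah</text><code>code</code><text>blah</text></description>";
        System.out.println(childElementsAsXml(xml));
    }
}
```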
