Get text from XML between tags in Java
I have XML entries like the following. I want to extract everything from the point where the d:index tag closes to the end of the entry.
<d:entry id="some_id" d:title="some_title">
<d:index d:value="some_value"/>
<h1>headlines</h1>
<p>paragraphs</p>
<div>
<ul>
<li>lists</li>
</ul>
</div>
text like that
</d:entry>
I tried using:
dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(file);
doc.getDocumentElement().normalize();
eList = doc.getElementsByTagName("d:entry");
for (int i = 0; i < eList.getLength(); i++) {
    Node nNode = eList.item(i);
    textList[i] = nNode.getTextContent();
}
But .getTextContent() only gives me 'text like that', and not:
<h1>headlines</h1>
<p>paragraphs</p>
<div>
<ul>
<li>lists</li>
</ul>
</div>
text like that
Depending on what exactly you want to do, you could do something like this:
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;
public class Arbeiter {

    public void arbeiten(File datei) {
        Document doc = getDoc(datei);
        if (doc == null) {
            return; // parsing failed; the error was already reported in getDoc
        }
        Element element = doc.getDocumentElement();
        print(element);
    }

    private Document getDoc(File datei) {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        Document doc = null;
        try {
            DocumentBuilder db = dbf.newDocumentBuilder();
            doc = db.parse(datei);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
        return doc;
    }

    // Walks the tree depth-first and prints each non-blank text node once.
    private void print(Node node) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getTextContent().trim();
            if (!text.isEmpty()) {
                System.out.println(text);
            }
            return;
        }
        // Visit every child by index (not just the first child repeatedly).
        for (int i = 0; i < node.getChildNodes().getLength(); i++) {
            print(node.getChildNodes().item(i));
        }
    }
}
The output is:
headlines
paragraphs
lists
text like that
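Note that the output above contains only the text, without the h1, p, div, ul and li tags. If the markup itself is needed, as the question asks, one possible approach is to serialize each node that follows d:index back to a string. The following is only a sketch under some assumptions: the class name EntryMarkup and the helper extract are made up for illustration, the XML is passed in as a string rather than read from a file, and the JDK's built-in identity Transformer is used as the serializer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EntryMarkup {

    // Hypothetical helper: for each d:entry, returns the serialized markup
    // of all child nodes that come after the first d:index child.
    static List<String> extract(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // An identity transformer copies DOM nodes to text unchanged.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.METHOD, "xml");
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> result = new ArrayList<>();
        NodeList entries = doc.getElementsByTagName("d:entry");
        for (int i = 0; i < entries.getLength(); i++) {
            StringWriter out = new StringWriter();
            boolean pastIndex = false;
            for (Node child = entries.item(i).getFirstChild();
                 child != null;
                 child = child.getNextSibling()) {
                if (pastIndex) {
                    // Append this sibling (element or text) to the output.
                    serializer.transform(new DOMSource(child), new StreamResult(out));
                } else if ("d:index".equals(child.getNodeName())) {
                    pastIndex = true;
                }
            }
            result.add(out.toString().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<d:entry id=\"some_id\" d:title=\"some_title\">"
                + "<d:index d:value=\"some_value\"/>"
                + "<h1>headlines</h1><p>paragraphs</p>"
                + "<div><ul><li>lists</li></ul></div>"
                + "text like that</d:entry>";
        for (String markup : extract(xml)) {
            System.out.println(markup);
        }
    }
}
```

The tag name is matched with getNodeName() because the default DocumentBuilderFactory is not namespace-aware, so "d:index" is treated as a plain name. Special characters in text are re-escaped by the serializer, so the result may differ slightly from the original bytes of the file.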