[英]Retrieve values from XML tag in Java
I have a set of XML string outputs from a natural language tool and need to retrieve values out of them, also provide null value to those tags that are not presented in the output string. 我从自然语言工具中获得了一组XML字符串输出,需要从中检索值,还需要为输出字符串中未显示的那些标签提供空值。 Tried to use the Java codes provided in Extracting data from XML using Java but it doesn't seem to work.
试图使用使用Java 从XML提取数据中提供的Java代码,但似乎不起作用。
Current sample tag inventory is listed below: 下面列出了当前样本标签清单:
<TimeStamp>, <Role>, <SpeakerId>, <Person>, <Location>, <Organization>
Sample XML output string: 示例XML输出字符串:
<TimeStamp>00.00.00</TimeStamp> <Role>Speaker1</Role><SpeakerId>1234</SpeakerId>Blah, blah, blah.
Desire outputs: 需求输出:
TimeStamp: 00.00.00
Role: Speaker1
SpeakerId: 1234
Person: null
Place: null
Organization: null
In order to use the Java codes provided in above link (in updated code), I inserted <Dummy>
and </Dummy>
as follows: 为了使用以上链接中提供的Java代码(在更新的代码中),我如下插入
<Dummy>
和</Dummy>
:
<Dummy><TimeStamp>00.00.00</TimeStamp><Role>Speaker1</Role><SpeakerId>1234</SpeakerId>Blah, blah, blah.</Dummy>
However, it returns dummy and null only. 但是,它仅返回dummy和null。 Since I'm still a newbie to Java, detailed explanations will be much appreciated.
由于我仍然是Java的新手,因此非常感谢详细的解释。
Try this way :D hope can help you 尝试这种方式:D希望可以帮助您
File fXmlFile = new File("yourfile.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
You can get child node list like this: 您可以这样获得子节点列表:
NodeList nList = doc.getElementsByTagName("staff");
Get the item like this: 得到这样的项目:
Node nNode = nList.item(temp);
This is what I ended up doing for my Java wrapper (Show TimeStamp only) 这就是我最终为Java包装器所做的事情(仅显示TimeStamp)
public class NERPost {
public String convertXML (String input) {
String nerOutput = input;
try {
DocumentBuilderFactory docBuilderFactory =
DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(nerOutput));
Document doc = docBuilder.parse(is);
// normalize text representation
doc.getDocumentElement ().normalize ();
NodeList listOfDummies = doc.getElementsByTagName("dummy");
for(int s=0; s<listOfDummies.getLength() ; s++){
Node firstDummyNode = listOfDummies.item(s);
if(firstDummyNode.getNodeType() == Node.ELEMENT_NODE){
Element firstDummyElement = (Element)firstDummyNode;
//Convert each entity label --------------------------------
//TimeStamp
String ts = "<TimeStamp>";
Boolean foundTs;
if (foundTs = nerOutput.contains(ts)) {
NodeList timeStampList = firstDummyElement.getElementsByTagName("TimeStamp");
//do it recursively
for (int i=0; i<timeStampList.getLength(); i++) {
Node firstTimeStampNode = timeStampList.item(i);
Element timeStampElement = (Element)firstTimeStampNode;
NodeList textTSList = timeStampElement.getChildNodes();
String timeStampOutput = ((Node)textTSList.item(0)).getNodeValue().trim();
System.out.println ("<TimeStamp>" + timeStampOutput + "</TimeStamp>\n")
} //end for
}//end if
//other XML tags
//.....
}//end if
}//end for
}
catch...
}//end try
}}
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.