Retrieving values from XML tags in Java

Question

I have a set of XML string outputs from a natural language tool and need to retrieve values from them, supplying null for any tag that is not present in the output string. I tried the Java code provided in "Extracting data from XML using Java", but it doesn't seem to work.

The current sample tag inventory is listed below:

<TimeStamp>, <Role>, <SpeakerId>, <Person>, <Location>, <Organization>

Sample XML output string:

<TimeStamp>00.00.00</TimeStamp> <Role>Speaker1</Role><SpeakerId>1234</SpeakerId>Blah, blah, blah.

Desired output:

TimeStamp: 00.00.00
Role: Speaker1
SpeakerId: 1234
Person: null
Location: null
Organization: null

In order to use the Java code provided in the link above (in the updated code), I inserted <Dummy> and </Dummy> as follows:

<Dummy><TimeStamp>00.00.00</TimeStamp><Role>Speaker1</Role><SpeakerId>1234</SpeakerId>Blah, blah, blah.</Dummy>

However, it returns only dummy and null. Since I'm still new to Java, detailed explanations would be much appreciated.

Answer 1

Try it this way; I hope it helps.

File fXmlFile = new File("yourfile.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);

You can get the list of elements with a given tag name like this:

NodeList nList = doc.getElementsByTagName("staff");

Get an item from the list like this, where temp is an index into the list:

Node nNode = nList.item(temp);

Example site
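
The same steps can be applied to an in-memory string rather than a file. What follows is a minimal sketch, assuming the fragment is wrapped in a single root element (here <Dummy>, as in the question); the class name TagLookup and the helpers parse and firstText are made up for illustration. It also suggests why printing an element node shows only a name and null: getNodeValue() is defined to return null for element nodes, while getTextContent() collects the text held in the child text nodes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class TagLookup {

    // Hypothetical helper: parses an XML string that has exactly one root element.
    // Checked parser exceptions are rethrown as IllegalArgumentException.
    public static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    // Hypothetical helper: trimmed text of the first element named tag, or null if absent.
    public static String firstText(Document doc, String tag) {
        NodeList nList = doc.getElementsByTagName(tag);
        if (nList.getLength() == 0) {
            return null;
        }
        return nList.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<Dummy><TimeStamp>00.00.00</TimeStamp><Role>Speaker1</Role>"
                   + "<SpeakerId>1234</SpeakerId>Blah, blah, blah.</Dummy>";
        Document doc = parse(xml);

        // An element node has a name but no node value.
        Node root = doc.getDocumentElement();
        System.out.println(root.getNodeName() + " / " + root.getNodeValue()); // Dummy / null

        System.out.println("Role: " + firstText(doc, "Role"));     // Role: Speaker1
        System.out.println("Person: " + firstText(doc, "Person")); // Person: null
    }
}
```

Tag names passed to getElementsByTagName are case-sensitive, so "Role" and "role" are different lookups.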

Answer 2

This is what I ended up doing for my Java wrapper (only TimeStamp is shown):

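  import java.io.IOException;
  import java.io.StringReader;
  import javax.xml.parsers.DocumentBuilder;
  import javax.xml.parsers.DocumentBuilderFactory;
  import javax.xml.parsers.ParserConfigurationException;
  import org.w3c.dom.Document;
  import org.w3c.dom.Element;
  import org.w3c.dom.Node;
  import org.w3c.dom.NodeList;
  import org.xml.sax.InputSource;
  import org.xml.sax.SAXException;

  public class NERPost {

      public String convertXML(String input) {
          String nerOutput = input;
          try {
              DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
              DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
              InputSource is = new InputSource();
              is.setCharacterStream(new StringReader(nerOutput));
              Document doc = docBuilder.parse(is);

              // normalize text representation
              doc.getDocumentElement().normalize();
              // tag names are case-sensitive: this matches <dummy>, not <Dummy>
              NodeList listOfDummies = doc.getElementsByTagName("dummy");

              for (int s = 0; s < listOfDummies.getLength(); s++) {
                  Node firstDummyNode = listOfDummies.item(s);
                  if (firstDummyNode.getNodeType() == Node.ELEMENT_NODE) {
                      Element firstDummyElement = (Element) firstDummyNode;

                      // Convert each entity label --------------------------------

                      // TimeStamp
                      String ts = "<TimeStamp>";
                      if (nerOutput.contains(ts)) {
                          NodeList timeStampList = firstDummyElement.getElementsByTagName("TimeStamp");

                          // handle every <TimeStamp> element in turn
                          for (int i = 0; i < timeStampList.getLength(); i++) {
                              Element timeStampElement = (Element) timeStampList.item(i);
                              NodeList textTSList = timeStampElement.getChildNodes();
                              // the first child is the text node holding the value
                              String timeStampOutput = textTSList.item(0).getNodeValue().trim();
                              System.out.println("<TimeStamp>" + timeStampOutput + "</TimeStamp>\n");
                          } // end for
                      } // end if
                      // other XML tags
                      // .....
                  } // end if
              } // end for
          } catch (ParserConfigurationException | SAXException | IOException e) {
              // the handler is elided in the original post; these are the
              // checked exceptions thrown by the parser calls above
              e.printStackTrace();
          }
          // the return value is not shown in the original post;
          // the input is returned unchanged here
          return nerOutput;
      }
  }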
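
The per-tag block above could be repeated for each label, but a loop over the tag inventory avoids the duplication and yields null for absent tags directly. The following is only a sketch under stated assumptions: the class TagExtractor, the method extract, and the wrapper name Dummy are illustrative and not part of the original post; the tag names come from the inventory in the question; each tag is assumed to occur at most once per fragment; and the fragment is assumed to be otherwise well-formed XML (for example, no bare & or < in the free text).

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class TagExtractor {

    // Tag inventory taken from the question.
    static final String[] TAGS = {
        "TimeStamp", "Role", "SpeakerId", "Person", "Location", "Organization"
    };

    // Hypothetical helper: maps every inventory tag to its text, or to null if absent.
    public static Map<String, String> extract(String fragment) {
        // XML needs a single root element, so the fragment is wrapped first.
        String xml = "<Dummy>" + fragment + "</Dummy>";
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Fragment is not well-formed XML", e);
        }

        // LinkedHashMap keeps the inventory order and permits null values.
        Map<String, String> values = new LinkedHashMap<>();
        for (String tag : TAGS) {
            NodeList matches = doc.getElementsByTagName(tag);
            values.put(tag, matches.getLength() > 0
                    ? matches.item(0).getTextContent().trim()
                    : null);
        }
        return values;
    }

    public static void main(String[] args) {
        String sample = "<TimeStamp>00.00.00</TimeStamp> <Role>Speaker1</Role>"
                      + "<SpeakerId>1234</SpeakerId>Blah, blah, blah.";
        // Prints one "Tag: value" line per inventory tag, with null for missing ones.
        extract(sample).forEach((tag, value) -> System.out.println(tag + ": " + value));
    }
}
```

Returning a map rather than printing inside the loop leaves the caller free to format the values or test for null.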

Answer 1
0 2013-06-05 02:54:35

Answer 2
0 Accepted 2013-08-03 22:24:16
