
Retrieve XML Element names with Java from unknown message format

I am parsing XML from many JMS topics. The structure of each message varies a lot, so I'd like to build one general tool that can parse them all.

To start, all I want to do is get the element names:

<gui-action>
  <action>some action</action>
  <params>
    <param1>blue</param1>
    <param2>tall</param2>
  </params>
</gui-action>

I just want to retrieve the strings "gui-action", "action", "params", "param1", and "param2". Duplicates are fine.

I've tried using org.w3c.dom.Node, Element, and NodeList, but I'm not having much luck. I keep getting the element values, not the names.

private Element root;
private Document doc;
private NodeList nl;

//messageStr is passed in elsewhere in the code
//but is a string of the full XML message.
doc = xmlParse( messageStr );
root = doc.getDocumentElement();

nl = root.getChildNodes();
int size = nl.getLength();
for (int i=0; i<size; i++) {
    log.info( nl.item(i).getNodeName() );
}





public Document xmlParse( String xml ){
  DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
  Document parsed = null;

  try {
    //Using factory get an instance of document builder
    DocumentBuilder db = dbf.newDocumentBuilder();
    InputSource is = new InputSource(new StringReader( xml ) );

    //parse using builder to get DOM representation of the XML message
    parsed = db.parse( is );

  } catch(ParserConfigurationException pce) {
      pce.printStackTrace();
  } catch(SAXException se) {
      se.printStackTrace();
  } catch(IOException ioe) {
      ioe.printStackTrace();
  }
  //null if parsing failed
  return parsed;
}

My logged "parsed" XML looks like this:
#text
action
#text
params
#text

Figured it out. I was iterating over only the child nodes and not including the parent. So now I just filter out the #text nodes (the whitespace between elements) and include the parent. Derp.

log.info( root.getNodeName() );
for (int i=0; i<size; i++) {
    String nodeName = nl.item(i).getNodeName();
    //compare strings with equals(); != only compares object references
    if( !"#text".equals( nodeName ) ) {
        log.info( nodeName );
    }
}
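For anyone who wants to try this outside my logging setup, here is a minimal self-contained sketch of the same idea. It checks the node type with Node.ELEMENT_NODE instead of matching the "#text" name, which I think is a bit more robust since it also skips comments and similar nodes. The class name ChildElementNames and the method rootAndChildNames are made up for this example and are not part of my real code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this example only.
class ChildElementNames {

    // Returns the root element's name followed by the names of its
    // direct child elements. Whitespace text nodes are skipped.
    static List<String> rootAndChildNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();

            List<String> names = new ArrayList<>();
            names.add(root.getNodeName());

            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Keep elements only; text nodes report the name "#text".
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(child.getNodeName());
                }
            }
            return names;
        } catch (Exception e) {
            // Wrapped so callers of this sketch need no checked exceptions.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<gui-action>\n"
                + "  <action>some action</action>\n"
                + "  <params>\n"
                + "    <param1>blue</param1>\n"
                + "    <param2>tall</param2>\n"
                + "  </params>\n"
                + "</gui-action>";
        System.out.println(rootAndChildNames(xml)); // prints [gui-action, action, params]
    }
}
```

Note that this still only covers the root and its direct children, so param1 and param2 are not listed.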

Now if anyone knows a way to get a NodeList of the entire document, that would be awesome.
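One approach that should cover this: Document.getElementsByTagName accepts the special value "*", which matches every element, and the returned NodeList is in document order with no #text entries. Below is a rough sketch under that assumption. The class name AllElementNames and the method allNames are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this example only.
class AllElementNames {

    // Returns the name of every element in the document, in document order.
    static List<String> allNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // "*" matches all elements, including the root.
            NodeList all = doc.getElementsByTagName("*");

            List<String> names = new ArrayList<>();
            for (int i = 0; i < all.getLength(); i++) {
                names.add(all.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            // Wrapped so callers of this sketch need no checked exceptions.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<gui-action>\n"
                + "  <action>some action</action>\n"
                + "  <params>\n"
                + "    <param1>blue</param1>\n"
                + "    <param2>tall</param2>\n"
                + "  </params>\n"
                + "</gui-action>";
        // prints [gui-action, action, params, param1, param2]
        System.out.println(allNames(xml));
    }
}
```

Because only elements are returned, no filtering is needed, and duplicate names show up once per occurrence.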


 