Retrieve XML Element names with Java from unknown message format
I am parsing XML from lots of JMS messaging topics, so the structure of each message varies a lot and I'd like to make one general tool to parse them all.
To start, all I want to do is get the element names:
<gui-action>
  <action>some action</action>
  <params>
    <param1>blue</param1>
    <param2>tall</param2>
  </params>
</gui-action>
I just want to retrieve the strings "gui-action", "action", "params", "param1", and "param2". Duplicates are just fine.
I've tried using org.w3c.dom.Node, Element, and NodeList, and I'm not having much luck. I keep getting the element values, not the names.
private Element root;
private Document doc;
private NodeList nl;

// messageStr is passed in elsewhere in the code,
// but is a string of the full XML message.
doc = xmlParse( messageStr );
root = doc.getDocumentElement();
nl = root.getChildNodes();
int size = nl.getLength();
for (int i = 0; i < size; i++) {
    log.info( nl.item(i).getNodeName() );
}
public Document xmlParse( String xml ) {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db;
    InputSource is;
    try {
        // Using factory get an instance of document builder
        db = dbf.newDocumentBuilder();
        is = new InputSource( new StringReader( xml ) );
        // Parse using builder to get DOM representation of the XML message
        doc = db.parse( is );
    } catch (ParserConfigurationException pce) {
        pce.printStackTrace();
    } catch (SAXException se) {
        se.printStackTrace();
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }
    return doc;
}
My logged "parsed" XML looks like this:
#text
action
#text
params
#text
Figured it out. I was iterating over only the child nodes of the root, and not including the root itself. So now I just filter out the #text nodes and log the root's name first. Derp.
log.info( root.getNodeName() );
for (int i = 0; i < size; i++) {
    String nodeName = nl.item(i).getNodeName();
    // Compare strings with equals(); != only compares object references.
    if ( !"#text".equals( nodeName ) ) {
        log.info( nodeName );
    }
}
Now if anyone knows a way to get a NodeList of the entire document, that would be awesome.
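One possible way to get that, offered as a sketch rather than the definitive answer: the DOM method Document.getElementsByTagName accepts the special value "*", which matches every element in the document in document order, root included. Text nodes are never part of that list, so no "#text" filtering is needed. The class name ElementNames and the helper elementNames below are made up for illustration, and the sample message assumes the closing </params> tag that the XML in the question was meant to have.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ElementNames {

    // Hypothetical helper: returns the name of every element in the
    // message, at any depth, in document order (duplicates are kept).
    static List<String> elementNames(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));

        // "*" matches all element tags; whitespace text nodes are not included.
        NodeList all = doc.getElementsByTagName("*");

        List<String> names = new ArrayList<>();
        for (int i = 0; i < all.getLength(); i++) {
            names.add(all.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<gui-action><action>some action</action>"
                + "<params><param1>blue</param1><param2>tall</param2></params>"
                + "</gui-action>";
        // Prints: [gui-action, action, params, param1, param2]
        System.out.println(elementNames(xml));
    }
}
```

Because the lookup does not depend on the message structure, the same helper should work across topics with differently shaped messages. If nesting depth matters later, a recursive walk over getChildNodes() that checks getNodeType() == Node.ELEMENT_NODE would be the alternative.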