
Parsing XML in Java to extract all nodes & attributes

I am stuck on an issue trying to parse some XML documents to obtain the output I require.

Take this sample XML:

    <root>
      <ZoneRule Name="After" RequiresApproval="false">
        <Zone>
          <WSAZone ConsecutiveDayNumber="1">
            <DaysOfWeek>
              <WSADaysOfWeek Saturday="false"/>
            </DaysOfWeek>
            <SelectedLimits>
            </SelectedLimits>
            <SelectedHolidays>
            </SelectedHolidays>
          </WSAZone>
        </Zone>
      </ZoneRule>
      <ZoneRule Name="Before" RequiresApproval="false">
        <Zone>
          <WSAZone ConsecutiveDayNumber="3">
            <DaysOfWeek>
              <WSADaysOfWeek Saturday="true"/>
            </DaysOfWeek>
            <SelectedLimits>
            </SelectedLimits>
            <SelectedHolidays>
            </SelectedHolidays>
          </WSAZone>
        </Zone>
      </ZoneRule>
    </root>

What I am attempting to do is ignore the root tag (this is working, so no problems here) and treat each ZoneRule as its own individual block.

Once I have each ZoneRule isolated, I need to extract all of its nodes and attributes so that I can build a string to query a database and check whether it exists (this part is also working).

The issue I am having is that my code cannot separate out each individual ZoneRule block; for some reason they are all being processed as one.

My sample code is as follows:

    import java.io.File;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.NamedNodeMap;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    // Fields and methods below are members of the form class.
    public String testXML = "";
    int andCount = 0;

    public void printNote(NodeList nodeList) {
        for (int count = 0; count < nodeList.getLength(); count++) {
            Node tempNode = nodeList.item(count);
            // make sure it's an element node
            if (tempNode.getNodeType() == Node.ELEMENT_NODE) {
                if (tempNode.hasAttributes()) {
                    // get attribute names and values
                    NamedNodeMap nodeMap = tempNode.getAttributes();
                    for (int i = 0; i < nodeMap.getLength(); i++) {
                        Node node = nodeMap.item(i);
                        if (andCount == 0) {
                            testXML = testXML + "XMLDataAsXML.exist('//" + tempNode.getNodeName()
                                    + "[./@" + node.getNodeName() + "=\"" + node.getNodeValue() + "\"]')=1 \n";
                        } else {
                            testXML = testXML + " and XMLDataAsXML.exist('//" + tempNode.getNodeName()
                                    + "[./@" + node.getNodeName() + "=\"" + node.getNodeValue() + "\"]')=1 \n";
                        }
                        andCount = andCount + 1;
                    }
                }
                if (tempNode.hasChildNodes()) {
                    // loop again if the node has child nodes
                    printNote(tempNode.getChildNodes());
                }
            }
        }
    }

    private void jButton2ActionPerformed(java.awt.event.ActionEvent evt) {
        try {
            File file = new File("C:\\Test.xml");
            DocumentBuilder dBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = dBuilder.parse(file);
            //System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
            if (doc.hasChildNodes()) {
                printNote(doc.getChildNodes());
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
        System.out.println(testXML);
    }

This produces the following output (both nodes combined):

XMLDataAsXML.exist('//ZoneRule[./@Name="After"]')=1 
 and XMLDataAsXML.exist('//ZoneRule[./@RequiresApproval="false"]')=1 
 and XMLDataAsXML.exist('//WSAZone[./@ConsecutiveDayNumber="1"]')=1 
 and XMLDataAsXML.exist('//WSADaysOfWeek[./@Saturday="false"]')=1 
 and XMLDataAsXML.exist('//ZoneRule[./@Name="Before"]')=1 
 and XMLDataAsXML.exist('//ZoneRule[./@RequiresApproval="false"]')=1 
 and XMLDataAsXML.exist('//WSAZone[./@ConsecutiveDayNumber="3"]')=1 
 and XMLDataAsXML.exist('//WSADaysOfWeek[./@Saturday="true"]')=1 

What I am actually after is this (excuse the incomplete SQL statements):

XMLDataAsXML.exist('//ZoneRule[./@Name="After"]')=1 
 and XMLDataAsXML.exist('//ZoneRule[./@RequiresApproval="false"]')=1 
 and XMLDataAsXML.exist('//WSAZone[./@ConsecutiveDayNumber="1"]')=1 
 and XMLDataAsXML.exist('//WSADaysOfWeek[./@Saturday="false"]')=1 


XMLDataAsXML.exist('//ZoneRule[./@Name="Before"]')=1 
 and XMLDataAsXML.exist('//ZoneRule[./@RequiresApproval="false"]')=1 
 and XMLDataAsXML.exist('//WSAZone[./@ConsecutiveDayNumber="3"]')=1 
 and XMLDataAsXML.exist('//WSADaysOfWeek[./@Saturday="true"]')=1 

The XML that will be parsed will not always be exactly like the above, so I cannot use hardcoded XPaths etc. I need to loop through the document dynamically, looking for the ZoneRule node as my base (I will dynamically generate this value based on the file received), and then extract all the required info.

I am completely open to better methods than what I have tried above.

Thanks very much.

In your code, testXML and andCount are declared outside the printNote method and are never reset between iterations.

You start with the first ZoneRule and generate the correct text during the first iteration of the for loop (let's forget about the recursion for a moment). When you move on to the next ZoneRule, testXML still contains all the text generated so far and andCount is larger than 0, so the text generated for the next ZoneRule is simply appended to it.

You could reset andCount and testXML at the beginning of each iteration of the for loop, but then the recursively processed children would not be rendered correctly, because the recursive calls run the same loop and would wipe the text collected so far.

So either you need two methods, one to deal with the top-level ZoneRule elements and another for their children, or, much better, instead of appending text to a shared variable, you should redesign your methods so that they return a String value, which can then be appended correctly (with or without "and", with or without a new line) at the place where the method is called recursively.
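
As a rough illustration of the second option, here is a minimal sketch in which the recursive method returns its conditions instead of writing to a shared field. The class name ZoneRuleQueries and the methods conditions and buildQueries are made up for this example and are not from your code. It also assumes that the base elements (ZoneRule here) are never nested inside each other, and it parses from a String rather than a File only to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ZoneRuleQueries {

    // Hypothetical helper: returns one condition per attribute of this element
    // and of all its descendant elements. Nothing is stored in shared fields.
    public static List<String> conditions(Node node) {
        List<String> result = new ArrayList<>();
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return result; // skip text, comments, etc.
        }
        // Note: the order of attributes in a NamedNodeMap is implementation-dependent.
        NamedNodeMap attrs = node.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            result.add("XMLDataAsXML.exist('//" + node.getNodeName()
                    + "[./@" + attr.getNodeName() + "=\"" + attr.getNodeValue() + "\"]')=1");
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            result.addAll(conditions(children.item(i))); // the caller decides how to join
        }
        return result;
    }

    // Hypothetical helper: builds one query fragment per base element (e.g. "ZoneRule").
    // Assumes base elements are not nested inside each other.
    public static List<String> buildQueries(String xml, String baseTag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> queries = new ArrayList<>();
            NodeList bases = doc.getElementsByTagName(baseTag);
            for (int i = 0; i < bases.getLength(); i++) {
                // "and" is only inserted between conditions, so no counter is needed
                queries.add(String.join("\n and ", conditions(bases.item(i))));
            }
            return queries;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root>"
                + "<ZoneRule Name=\"After\" RequiresApproval=\"false\"><Zone>"
                + "<WSAZone ConsecutiveDayNumber=\"1\"><DaysOfWeek>"
                + "<WSADaysOfWeek Saturday=\"false\"/></DaysOfWeek></WSAZone></Zone></ZoneRule>"
                + "<ZoneRule Name=\"Before\" RequiresApproval=\"false\"><Zone>"
                + "<WSAZone ConsecutiveDayNumber=\"3\"><DaysOfWeek>"
                + "<WSADaysOfWeek Saturday=\"true\"/></DaysOfWeek></WSAZone></Zone></ZoneRule>"
                + "</root>";
        for (String query : buildQueries(xml, "ZoneRule")) {
            System.out.println(query);
            System.out.println(); // blank line between ZoneRule blocks
        }
    }
}
```

Because each call returns its own list, every ZoneRule gets a fresh result, and the joining with "and" happens in exactly one place. Like your original code, this sketch does not escape quotes in attribute values.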

