jsoup xml parsing - child nodes not displayed

I am parsing an XML string with jsoup, but only the first child of each Row element is printed.

My code:

import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class RunReport {

    public static void main(String[] args){
        String xmlcontent="<Results><ResultSet fetchSize=\"2\">"
                + "<Row rowNumber=\"1\"><TBC_ID>29379155</TBC_ID><TBC_DATE>2013-01-31</TBC_DATE></Row>"
                + "<Row rowNumber=\"2\"><TBC_ID>29379576</TBC_ID><TBC_DATE>2013-01-31</TBC_DATE></Row>";
        Document doc = Jsoup.parse(xmlcontent);
        Elements rows =doc.getElementsByTag("Row");
        List<Element> resultSet= doc.getElementsByTag("Row");
        for(int i=0; i<resultSet.size();i++){
            Element RsRecord = resultSet.get(i);
            Elements columns = RsRecord.children();
            for(Element column:columns){
                System.out.println("Row id:"+i+",Column Node name:"+column.nodeName()+",Value="+column.ownText());
            }

        }

    }
}

Output:

Row id:0,Column Node name:tbc_id,Value=29379155
Row id:1,Column Node name:tbc_id,Value=29379576

Each 'Row' tag has two child nodes, but my output shows only one child per row.

Expected:

Row id:0,Column Node name:tbc_id,Value=29379155
Row id:0,Column Node name:tbc_date,Value=2013-01-31
Row id:1,Column Node name:tbc_id,Value=29379576
Row id:1,Column Node name:tbc_date,Value=2013-01-31

This works for me:

package test;

import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class RunReport {

    public static void main(String[] args) {
        // Same data as in the question, with the tags renamed to data, a and b.
        String xmlcontent = "<Results>"
                + "<ResultSet fetchSize=\"2\">"
                + "<data rowNumber=\"1\">"
                + "<a>29379155</a>"
                + "<b>2013-01-31</b>"
                + "</data>"
                + "<data rowNumber=\"2\">"
                + "<a>29379576</a>"
                + "<b>2013-01-31</b>"
                + "</data>"
                + "</ResultSet>"
                + "</Results>";
        Document doc = Jsoup.parse(xmlcontent);
        List<Element> resultSet = doc.getElementsByTag("data");
        for (int i = 0; i < resultSet.size(); i++) {
            Element rsRecord = resultSet.get(i);
            // children() returns only the direct child elements of this row.
            Elements columns = rsRecord.children();
            for (Element column : columns) {
                System.out.println("Row id:" + i + ",Column Node name:" + column.nodeName() + ",Value=" + column.ownText());
            }
        }
    }
}

My guess is that you are using a reserved word in your XML. When I run your code, this is the structure printed for me:

<row rownumber="1">
 <tbc_id>
  29379155
  <tbc_date>
   2013-01-31
   <row rownumber="2">
    <tbc_id>
     29379576
     <tbc_date>
      2013-01-31
     </tbc_date>
    </tbc_id>
   </row>
  </tbc_date>
 </tbc_id>
</row>
<row rownumber="2">
 <tbc_id>
  29379576
  <tbc_date>
   2013-01-31
  </tbc_date>
 </tbc_id>
</row>
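
One thing worth ruling out is the input itself: the XML string in the question never closes ResultSet and Results. Jsoup.parse(String) treats its input as HTML and quietly repairs markup according to HTML rules, so a mistake like that raises no error, and the resulting tree may not match the nesting you intended. The sketch below reads the rows with the JDK's built-in DOM parser instead, which rejects input that is not well-formed. It is only an illustration under stated assumptions, not the original poster's method: the class name XmlRowReader and the method readRows are hypothetical, and the two missing closing tags have been added to the sample string. If you would rather stay with jsoup, it also has an XML mode, Jsoup.parse(xml, "", Parser.xmlParser()), which is not shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name does not come from the original post.
class XmlRowReader {

    // Returns one line per child element of every <Row>, in document order.
    // Throws IllegalArgumentException if the input is not well-formed XML.
    static List<String> readRows(String xml) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML: " + e.getMessage(), e);
        }

        List<String> lines = new ArrayList<>();
        // XML tag names are case-sensitive, so "Row" must match the input exactly.
        NodeList rows = doc.getElementsByTagName("Row");
        for (int i = 0; i < rows.getLength(); i++) {
            NodeList children = rows.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                // Skip text nodes (for example whitespace) and keep only elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    lines.add("Row id:" + i + ",Column Node name:" + child.getNodeName()
                            + ",Value=" + child.getTextContent());
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample data from the question, with </ResultSet></Results> added.
        String xml = "<Results><ResultSet fetchSize=\"2\">"
                + "<Row rowNumber=\"1\"><TBC_ID>29379155</TBC_ID><TBC_DATE>2013-01-31</TBC_DATE></Row>"
                + "<Row rowNumber=\"2\"><TBC_ID>29379576</TBC_ID><TBC_DATE>2013-01-31</TBC_DATE></Row>"
                + "</ResultSet></Results>";
        for (String line : readRows(xml)) {
            System.out.println(line);
        }
    }
}
```

With this parser the node names keep their original case (TBC_ID rather than tbc_id), and passing the unclosed string from the question makes readRows throw instead of returning a partial result.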
