
How do I parse an HTML string and split it by its parent divs?

I have an HTML string like the one below:

<div class="row xyz"> 

    <!--Multiple Other div's and tags-->

</div>

<div class="row xws"> 

    <!--Multiple Other div's and tags-->

</div>

<div class="row daze"> 

    <!--Multiple Other div's and tags-->

</div>

As you can see, it has 3 parent (top-level) divs.

How can I split my HTML string in Java into its parent divs, without relying on a class name such as "row xyz" (the class names are dynamically generated)? In this case there are 3 parent divs, so I would get:

String div1

String div2

String div3

where div1 =

<div class="row xyz"> 

    <!--Multiple Other div's and tags-->

</div>

AND

div2 =

<div class="row xws"> 

    <!--Multiple Other div's and tags-->

</div>

AND

div3 =

<div class="row daze"> 

    <!--Multiple Other div's and tags-->

</div>

Try using jsoup and selecting only the divs that are direct children of the body, like the following:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class MyClass {
    public static void main(String[] args) {
        // Sample input: three top-level divs, each containing nested divs.
        String html = "<html><head/><body>"
                + "<div class=\"row xyz\"> <div>div1</div> <div>div1_1</div> </div>"
                + "<div class=\"row xws\"> <div>div2</div> </div>"
                + "<div class=\"row daze\">  <div>div3</div></div>"
                + "</body></html>";
        Document document = Jsoup.parse(html);
        // "body > div" matches only divs that are direct children of <body>,
        // so nested divs are skipped and no class name is needed.
        Elements divs = document.select("body > div");

        for (int i = 0; i < divs.size(); i++) {
            // outerHtml() returns the div's own tags plus everything inside it.
            System.out.println(String.format("div%d = %s", i + 1, divs.get(i).outerHtml()));
        }
    }
}

The output will be:

div1 = <div class="row xyz"> 
 <div>
  div1
 </div> 
 <div>
  div1_1
 </div> 
</div>
div2 = <div class="row xws"> 
 <div>
  div2
 </div> 
</div>
div3 = <div class="row daze"> 
 <div>
  div3
 </div>
</div>
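
If adding jsoup as a dependency is not an option, the same idea (keep only the divs at nesting depth zero) can be roughly approximated with the standard library alone. The sketch below is not part of the answer above: the class name `TopLevelDivSplitter` and the method `splitTopLevelDivs` are hypothetical, and it assumes well-formed input in which no `<div>` text appears inside comments, scripts, or attribute values, and no attribute value contains `>`. For arbitrary HTML, a real parser such as jsoup remains the safer choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TopLevelDivSplitter {
    // Matches an opening <div ...> or a closing </div> tag, case-insensitively.
    // Group 1 is "/" for a closing tag and empty for an opening tag.
    private static final Pattern DIV_TAG =
            Pattern.compile("<(/?)div\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    /** Returns the outer HTML of every div that is not nested inside another div. */
    public static List<String> splitTopLevelDivs(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = DIV_TAG.matcher(html);
        int depth = 0;   // how many divs are currently open
        int start = -1;  // index where the current top-level div begins
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            if (!closing) {
                if (depth == 0) {
                    start = m.start();
                }
                depth++;
            } else if (depth > 0) {
                depth--;
                if (depth == 0) {
                    // The top-level div just closed: cut it out of the string.
                    result.add(html.substring(start, m.end()));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<div class=\"row xyz\"><div>div1</div><div>div1_1</div></div>"
                + "<div class=\"row xws\"><div>div2</div></div>"
                + "<div class=\"row daze\"><div>div3</div></div>";
        List<String> divs = splitTopLevelDivs(html);
        for (int i = 0; i < divs.size(); i++) {
            System.out.println("div" + (i + 1) + " = " + divs.get(i));
        }
    }
}
```

Unlike `outerHtml()` in jsoup, this returns each div exactly as it appears in the input string, without re-indenting it. Anything between the top-level divs (text or other tags) is ignored.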


 