jsoup: parse data of certain tag which is just after a particular tag

Question

I have been trying for the last 3 days to parse certain information with jsoup in Java. This is my code:

Document document = Jsoup.connect(urlofpage).get();
Elements links = document.select(".contentBox");

for (Element link : links) {
    // String name = link.text();
    String title = link.select("h2").text();
    String content = link.select("p").text();
    System.out.println(title);
    System.out.println(content);
}

It fetches the data as directed: the text of all h2 tags and the text of all p tags, each as one separate string. The problem is that I want the text of the <p> tags that come right after each <h2> tag, grouped per <h2>.

For example (HTML content):

<h2>main content</h2>
<div class="acx"><div>
<p>content</p>
<p>content 2</p>

<h2>content 2</h2>
<div class="acx"><div>
<p>new content od 2</p>
<p>new 2</p>

It should then produce the following (as an array):

array[0] = "content content 2",
array[1] = "new content od 2 new 2",

Any solutions?

Answer 1

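You can play with the "~" sibling selector. Note that "h2 ~ p" matches every <p> that follows an <h2> under the same parent, not only the first one ("h2 + p" would match only a <p> immediately after an <h2>). For example, assuming the <p> elements are siblings of the <h2> elements as in your sample:

link.select("h2 ~ p").get(0).text(); // returns "content"
link.select("h2 ~ p").get(1).text(); // returns "content 2", the next matching <p> in document order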
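Since the "~" selector returns one flat list, it does not by itself say which <p> belongs to which <h2>. One possible way to get the grouping asked for is to walk the siblings of each <h2> until the next <h2>. The following is only a sketch: the class name SiblingGrouping and the method paragraphsAfterEachH2 are made up for illustration, and it assumes the <div class="acx"> in the sample is closed, so that the <p> elements really are siblings of the <h2> elements.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SiblingGrouping {

    // For each <h2> under root, joins the text of the <p> siblings that
    // follow it, stopping at the next <h2> (hypothetical helper).
    static List<String> paragraphsAfterEachH2(Element root) {
        List<String> result = new ArrayList<>();
        for (Element h2 : root.select("h2")) {
            List<String> parts = new ArrayList<>();
            Element sibling = h2.nextElementSibling();
            while (sibling != null && !sibling.tagName().equals("h2")) {
                if (sibling.tagName().equals("p")) {
                    parts.add(sibling.text());
                }
                sibling = sibling.nextElementSibling();
            }
            result.add(String.join(" ", parts));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample markup from the question, with the "acx" div assumed closed.
        String html = "<div class=\"contentBox\">"
                + "<h2>main content</h2><div class=\"acx\"></div>"
                + "<p>content</p><p>content 2</p>"
                + "<h2>content 2</h2><div class=\"acx\"></div>"
                + "<p>new content od 2</p><p>new 2</p>"
                + "</div>";
        Document document = Jsoup.parse(html);
        for (Element box : document.select(".contentBox")) {
            // prints [content content 2, new content od 2 new 2]
            System.out.println(paragraphsAfterEachH2(box));
        }
    }
}
```

If the paragraphs are nested inside another element rather than being direct siblings of the heading, this walk will not find them.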

Answer 2

Just extend your initial approach and iterate over the required tags inside each selected .contentBox element:

Document document = Jsoup.connect(urlofpage).get();
Elements links = document.select(".contentBox");

for (Element link : links) {
    for (Element h2Tag : link.select("h2")) {
        System.out.println(h2Tag.text());
    }
    for (Element pTag : link.select("p")) {
        System.out.println(pTag.text());
    }
}
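The loop above prints every heading first and every paragraph afterwards, so the pairing between an <h2> and its paragraphs is lost. A variation that keeps the pairing is to select both tags at once and start a new group at each <h2>. This is a rough sketch, not a tested solution for the real page: the names DocumentOrderGrouping and groupParagraphsByHeading are hypothetical, and it relies on select("h2, p") returning matches in document order. It does not require the <p> elements to be direct siblings of the <h2>, so it also copes with the unclosed <div> tags in the sample.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class DocumentOrderGrouping {

    // Walks all <h2> and <p> elements under root in document order and
    // collects the <p> text under the most recent <h2> (hypothetical helper).
    static List<String> groupParagraphsByHeading(Element root) {
        List<String> groups = new ArrayList<>();
        StringBuilder current = null; // stays null until the first <h2> is seen
        for (Element el : root.select("h2, p")) {
            if (el.tagName().equals("h2")) {
                if (current != null) {
                    groups.add(current.toString());
                }
                current = new StringBuilder();
            } else if (current != null) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(el.text());
            }
            // A <p> that appears before any <h2> is skipped.
        }
        if (current != null) {
            groups.add(current.toString());
        }
        return groups;
    }

    public static void main(String[] args) {
        // Sample markup from the question, kept as posted (divs left unclosed).
        String html = "<div class=\"contentBox\">"
                + "<h2>main content</h2><div class=\"acx\"><div>"
                + "<p>content</p><p>content 2</p>"
                + "<h2>content 2</h2><div class=\"acx\"><div>"
                + "<p>new content od 2</p><p>new 2</p>"
                + "</div>";
        Document document = Jsoup.parse(html);
        for (Element box : document.select(".contentBox")) {
            List<String> array = groupParagraphsByHeading(box);
            for (int i = 0; i < array.size(); i++) {
                System.out.println("array[" + i + "] = \"" + array.get(i) + "\"");
            }
        }
    }
}
```

Each list entry corresponds to one <h2>, in the order the headings appear, which matches the array layout requested in the question.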
