JSoup, how to return data from a dynamic <a href> tag

I'm very new to JSoup and am trying to retrieve a changeable value that is stored within an <a> tag, specifically from the following website and HTML.

[Snapshot of HTML]

The value after "constituency/" is changeable and depends on the user's input. I am able to retrieve the h2 tags themselves but not the information within them. At the moment, the best return I can get is just the tags, using the method below.

The desired return would be something that I can substring down to:

Dublin Bay South

The actual return is:

<well.col-md-4.h2></well.col-md-4.h2>

    private String jSoupTDRequest(String aLine1, String aLine3) throws IOException {
        String constit = "";
        String h2 = "h2";
        String url = "https://www.whoismytd.com/search?utf8=✓&form-input=" + aLine1 + "%2C+" + aLine3 + "+Ireland";
        //Switch to try catch if time
        Document doc = Jsoup.connect(url)
                .timeout(6000).get();

        //Scrape elements from relevant section
        Elements body = doc.select("well.col-md-4.h2");
        Element e = new Element("well.col-md-4.h2");
        constit = e.toString();

        return constit;
    }

I am extremely new to JSoup and to scraping in general. I would appreciate any input from someone who knows what they're doing, or any alternative ways to get the desired result.

Change the code under your "Scrape elements from relevant section" comment as follows:

  • Select the very first <div class="well"> element first.

     Element tdsDiv = doc.select("div.well").first();

  • Select the very first <a> link element next. This link points to the constituency.

     Element constLink = tdsDiv.select("a").first();

  • Get the constituency name by grabbing this link's text content.

     constit = constLink.text();
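
The steps above can be sketched as one null-safe helper. In the question's code, the selector `well.col-md-4.h2` looks for a `<well>` element carrying the classes `col-md-4` and `h2`, and `new Element(...)` creates a fresh, empty element instead of reading one from the page, which explains the empty return. This sketch assumes the constituency link's href starts with `/constituency/` and sits inside a `div.well`. The `ConstituencyExtractor` class, its method name, and the inline `SAMPLE_HTML` are hypothetical stand-ins so the example runs without a network call; the real page markup may differ.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class ConstituencyExtractor {

    // Hypothetical markup modeled on the question's description, not copied from the site.
    static final String SAMPLE_HTML =
            "<div class=\"well col-md-4\">"
          + "<h2><a href=\"/constituency/dublin-bay-south\">Dublin Bay South</a></h2>"
          + "</div>";

    /** Returns the constituency name, or an empty string if no matching link is found. */
    static String extractConstituency(Document doc) {
        // [href^=...] matches on the fixed prefix, so the changeable part of the href is irrelevant.
        Element link = doc.selectFirst("div.well a[href^=/constituency/]");
        return link == null ? "" : link.text();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        System.out.println(extractConstituency(doc));
    }
}
```

Matching on the fixed prefix rather than the full href keeps the lookup working when the constituency changes with the user's input, and the null check avoids a `NullPointerException` when the search returns no result.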

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

import java.io.IOException;

@DisplayName("JSoup, how to return data from a dynamic <a href> tag")
class JsoupQuestionTest {
    private static final String URL = "https://www.whoismytd.com/search?utf8=%E2%9C%93&form-input=Kildare%20Street%2C%20Dublin%2C%20Ireland";

    @Test
    void findSomeText() throws IOException {
        String expected = "Dublin Bay South";
        // Fetch the live search results page for the address in URL.
        Document document = Jsoup.connect(URL).get();
        // Look up the link by its exact href and read its text content.
        String actual = document.getElementsByAttributeValue("href", "/constituency/dublin-bay-south").text();
        Assertions.assertEquals(expected, actual);
    }
}
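
The test URL above is already percent-encoded, while the method in the question concatenates raw user input into the query string. A minimal sketch of building the same kind of URL with the JDK's `URLEncoder` follows. The `SearchUrlBuilder` class and `buildSearchUrl` method are hypothetical names, and the comma-separated address format is an assumption modeled on the test URL. Note that `URLEncoder` encodes spaces as `+` rather than `%20`, which matches the `+` separators already used in the question's URL.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SearchUrlBuilder {

    private static final String BASE = "https://www.whoismytd.com/search";

    /** Builds the search URL, encoding every query value (requires Java 10+ for this overload). */
    static String buildSearchUrl(String aLine1, String aLine3) {
        // Assumed address format: "<line 1>, <line 3>, Ireland".
        String address = aLine1 + ", " + aLine3 + ", Ireland";
        return BASE
                + "?utf8=" + URLEncoder.encode("\u2713", StandardCharsets.UTF_8) // the check mark
                + "&form-input=" + URLEncoder.encode(address, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(buildSearchUrl("Kildare Street", "Dublin"));
    }
}
```

Encoding the values this way also covers input containing characters such as `&` or `#`, which would otherwise change the meaning of the query string.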

 