JSoup, how to return data from a dynamic <a href> tag
Very new to JSoup, trying to retrieve a changeable value that is stored within an <a> tag, specifically from the following website and HTML.
Snapshot of HTML
The results after "constituency/" are changeable and depend on the user's input. I am able to retrieve the h2 tags themselves but not the information within. At the moment the best return I can get is just tags, using the method below.
The desired return would be something that I can substring down into
Dublin Bay South
The actual return is
<well.col-md-4.h2></well.col-md-4.h2>
private String jSoupTDRequest(String aLine1, String aLine3) throws IOException {
    String constit = "";
    String h2 = "h2";
    String url = "https://www.whoismytd.com/search?utf8=✓&form-input="+aLine1+"%2C+"+aLine3+"+Ireland";
    //Switch to try catch if time
    Document doc = Jsoup.connect(url)
            .timeout(6000).get();
    //Scrape elements from relevant section
    Elements body = doc.select("well.col-md-4.h2");
    Element e = new Element("well.col-md-4.h2");
    constit = e.toString();
    return constit;
}
I am extremely new to JSoup and scraping in general. Would appreciate any input from someone who knows what they're doing, or any alternate ways to try and get the desired result.
Change the code under your "Scrape elements from relevant section" comment as follows:
Select the very first <div class="well"> element first.
    Element tdsDiv = doc.select("div.well").first();
Select the very first <a> link element next. This link points to the constituency.
    Element constLink = tdsDiv.select("a").first();
Get the constituency name by grabbing this link's text content.
    constit = constLink.text();
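The three steps can be tried offline against a small HTML string before pointing them at the live site. The sketch below is only an illustration: the class name `ConstituencyExtractor`, the method `extractConstituency`, and the HTML fragment are assumptions made for this example (the real page markup is only known from the snapshot), and the null checks are an addition so that a page without a matching element yields an empty string rather than a NullPointerException.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class, not part of the original question or answer.
class ConstituencyExtractor {

    // Returns the text of the first <a> inside the first <div class="well">,
    // or an empty string if either element is missing.
    static String extractConstituency(String html) {
        Document doc = Jsoup.parse(html);

        // Step 1: the very first <div class="well"> element.
        Element tdsDiv = doc.select("div.well").first();
        if (tdsDiv == null) {
            return "";
        }

        // Step 2: the very first <a> link inside that div.
        Element constLink = tdsDiv.select("a").first();
        if (constLink == null) {
            return "";
        }

        // Step 3: the link's text content is the constituency name.
        return constLink.text();
    }

    public static void main(String[] args) {
        // Assumed fragment, loosely modelled on the structure described in the question.
        String html = "<div class=\"well\"><h2>"
                + "<a href=\"/constituency/dublin-bay-south\">Dublin Bay South</a>"
                + "</h2></div>";
        System.out.println(extractConstituency(html)); // prints "Dublin Bay South"
    }
}
```

Parsing a string with Jsoup.parse keeps the selector logic testable without a network call; in the question's method the same three lines would operate on the Document returned by Jsoup.connect(url).get().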
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

import java.io.IOException;

@DisplayName("JSoup, how to return data from a dynamic <a href> tag")
class JsoupQuestionTest {
    private static final String URL = "https://www.whoismytd.com/search?utf8=%E2%9C%93&form-input=Kildare%20Street%2C%20Dublin%2C%20Ireland";

    @Test
    void findSomeText() throws IOException {
        String expected = "Dublin Bay South";
        Document document = Jsoup.connect(URL).get();
        // Look the link up directly by its href value and read its text.
        String actual = document.getElementsByAttributeValue("href", "/constituency/dublin-bay-south").text();
        Assertions.assertEquals(expected, actual);
    }
}
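One more difference worth noting: the test URL above is already percent-encoded, whereas the method in the question concatenates the raw user input (and a literal ✓) into the query string. A minimal sketch of building the same kind of URL with the JDK's URLEncoder follows. The class name `SearchUrlBuilder` and the method `buildSearchUrl` are hypothetical, and it is an assumption that the site accepts `+` for spaces (the question's own URL uses `+`, the test URL uses `%20`).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original question or answer.
class SearchUrlBuilder {

    // Base address taken from the question.
    private static final String BASE = "https://www.whoismytd.com/search";

    // Joins the two address lines the same way the question does
    // ("<line1>, <line3> Ireland") and encodes the result for a query string.
    static String buildSearchUrl(String aLine1, String aLine3) {
        String address = aLine1 + ", " + aLine3 + " Ireland";
        return BASE
                + "?utf8=" + URLEncoder.encode("\u2713", StandardCharsets.UTF_8) // the check mark
                + "&form-input=" + URLEncoder.encode(address, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(buildSearchUrl("Kildare Street", "Dublin"));
        // prints https://www.whoismytd.com/search?utf8=%E2%9C%93&form-input=Kildare+Street%2C+Dublin+Ireland
    }
}
```

URLEncoder.encode(String, Charset) requires Java 10 or later; on older versions the overload taking a charset name ("UTF-8") does the same job but declares a checked exception.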