jsoup - Trying to find a specific a href
I am trying to find the circled link on a webpage, as shown in the image on imgur. Currently (see below) I pull all a hrefs from the document and loop through them looking for the one that contains "pdf", since it is the only one on the page. Is there any way to pull just the a href where title = "Download offers in store", or something like that?
String pdfLink = null;
Document doc = Jsoup.connect("http://www.dunnesstores.com/offer20/food-wine/fcp-category/home").get();
// Select every <a> element that has an href attribute
Elements links = doc.select("a[href]");
for (Element link : links) {
    System.out.println(link.attr("href"));
    // Keep the link whose href mentions "pdf"
    if (link.attr("href").contains("pdf")) {
        pdfLink = link.attr("href");
    }
}
You can specify a selector that matches an attribute and its value.
String pdfLink = null;
Document doc = Jsoup.connect("http://www.dunnesstores.com/offer20/food-wine/fcp-category/home").get();
Elements links = doc.select("a[title=\"Download offers in store\"]");
for (Element link : links) {
pdfLink = link.attr("abs:href");
}
System.out.println(pdfLink);
This selects every a tag whose title attribute equals "Download offers in store".
If you would rather find the element by its .pdf file ending, you could change the selector to:
a[href$=".pdf"]
https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors
doc.select("a[title='Download offers in store']");
[attr] Represents an element with an attribute named attr.
[attr=value] Represents an element with an attribute named attr whose value is exactly "value".
[attr~=value] Represents an element with an attribute named attr whose value is a whitespace-separated list of words, one of which is exactly "value".
[attr|=value] Represents an element with an attribute named attr whose value is either exactly "value" or begins with "value" immediately followed by "-" (U+002D). It can be used for language subcode matches.
[attr^=value] Represents an element with an attribute named attr whose value is prefixed by "value".
[attr$=value] Represents an element with an attribute named attr whose value is suffixed by "value".
[attr*=value] Represents an element with an attribute named attr whose value contains at least one occurrence of the string "value" as a substring.
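To make the matching rules listed above concrete, here is a minimal plain-Java sketch of what each operator tests. The class AttrSelectorSketch and its matches method are hypothetical names made up for illustration. This is not how jsoup or a browser implements selectors, and it compares case-sensitively as a simplifying assumption.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of jsoup.
class AttrSelectorSketch {

    /**
     * Returns true if attrValue satisfies the attribute-selector operator
     * op with the operand value. A null attrValue means the attribute is
     * absent. Comparison is case-sensitive (an assumption of this sketch).
     */
    public static boolean matches(String op, String attrValue, String value) {
        if (attrValue == null) {
            return false; // attribute absent: no value operator can match
        }
        switch (op) {
            case "=":  // value is exactly "value"
                return attrValue.equals(value);
            case "~=": // one whitespace-separated word is exactly "value"
                return !value.isEmpty()
                        && Arrays.asList(attrValue.trim().split("\\s+")).contains(value);
            case "|=": // exactly "value", or "value" followed by "-"
                return attrValue.equals(value) || attrValue.startsWith(value + "-");
            case "^=": // prefixed by "value"
                return !value.isEmpty() && attrValue.startsWith(value);
            case "$=": // suffixed by "value"
                return !value.isEmpty() && attrValue.endsWith(value);
            case "*=": // contains "value" as a substring
                return !value.isEmpty() && attrValue.contains(value);
            default:
                throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        // The href below is a made-up example value.
        System.out.println(matches("$=", "/offers/weekly.pdf", ".pdf"));
        System.out.println(matches("=", "Download offers", "Download offers in store"));
    }
}
```

Under these assumptions, the title selector from the answers corresponds to the "=" case and the .pdf selector to the "$=" case.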