JSOUP - Trying to find a specific a href

I am trying to find one specific link on a webpage, the one circled in my screenshot (hosted on imgur). Currently, as shown below, I pull every a element with an href from the document and loop through them looking for the one whose href contains "pdf", since it is the only such link on the page. Is there a way to select just the a element whose title is "Download offers in store", or something like that?

String pdfLink = null;

Document doc = Jsoup.connect("http://www.dunnesstores.com/offer20/food-wine/fcp-category/home").get();
// Select every anchor that has an href attribute
Elements links = doc.select("a[href]");

for (Element link : links) {
    System.out.println(link.attr("href"));
    // Keep the link whose href contains "pdf"
    if (link.attr("href").contains("pdf")) {
        pdfLink = link.attr("href");
    }
}

You could specify a selector that matches an attribute and its value.

String pdfLink = null;

Document doc = Jsoup.connect("http://www.dunnesstores.com/offer20/food-wine/fcp-category/home").get();
Elements links = doc.select("a[title=\"Download offers in store\"]");

for (Element link : links) {
    pdfLink = link.attr("abs:href");
}

System.out.println(pdfLink);

This selects every a tag whose title attribute equals "Download offers in store".

If you want to find the element by the .pdf file ending instead, you could change the selector to:

a[href$=".pdf"]
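To try both selectors without hitting the live site, here is a minimal sketch that parses a small inline HTML snippet. The markup, the example.com base URI, and the names PdfLinkFinder and firstLink are made up for illustration, and it assumes jsoup 1.11.1 or later so that selectFirst is available.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PdfLinkFinder {

    // Returns the absolute URL of the first link matching the selector,
    // or null if nothing matches.
    static String firstLink(Document doc, String cssQuery) {
        Element link = doc.selectFirst(cssQuery); // null when there is no match
        return link == null ? null : link.absUrl("href");
    }

    public static void main(String[] args) {
        // Hypothetical markup standing in for the real page.
        String html = "<a href=\"/about\">About</a>"
                + "<a title=\"Download offers in store\" href=\"/files/offers.pdf\">Offers</a>";

        // The base URI lets jsoup resolve relative hrefs; example.com is a placeholder.
        Document doc = Jsoup.parse(html, "http://example.com/");

        // Match on the exact title value.
        System.out.println(firstLink(doc, "a[title='Download offers in store']"));
        // Match on the href suffix.
        System.out.println(firstLink(doc, "a[href$='.pdf']"));
        // A selector with no match yields null instead of throwing.
        System.out.println(firstLink(doc, "a[title='No such title']"));
    }
}
```

Using selectFirst with a null check avoids looping when only one match is expected, and absUrl("href") does the same job as attr("abs:href") in the answer's code.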

https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors

doc.select("a[title='Download offers in store']");

[attr] Represents an element with an attribute named attr.

[attr=value] Represents an element with an attribute named attr whose value is exactly "value".

[attr~=value] Represents an element with an attribute named attr whose value is a whitespace-separated list of words, one of which is exactly "value".

[attr|=value] Represents an element with an attribute named attr whose value is either exactly "value" or begins with "value" immediately followed by "-" (U+002D). It can be used for language subcode matches.

[attr^=value] Represents an element with an attribute named attr whose value begins with "value".

[attr$=value] Represents an element with an attribute named attr whose value ends with "value".

[attr*=value] Represents an element with an attribute named attr whose value contains at least one occurrence of "value" as a substring.
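Each operator in the list above boils down to a simple string test on the attribute value. The following plain-Java sketch (no jsoup needed) spells those tests out; the class AttrMatch and its matches method are hypothetical names used only for illustration. It mirrors the CSS definitions quoted above, not jsoup's implementation; in particular, jsoup's own selector syntax treats ~= as a regular-expression match rather than a word-list match.

```java
import java.util.Arrays;

public class AttrMatch {

    // Applies one CSS attribute operator to an attribute value.
    // attrValue is null when the element does not have the attribute.
    // Use the empty string as op for the bare [attr] form.
    static boolean matches(String attrValue, String op, String value) {
        if (attrValue == null) {
            return false; // attribute absent: nothing can match
        }
        switch (op) {
            case "":   // [attr]: the attribute merely has to exist
                return true;
            case "=":  // exact value
                return attrValue.equals(value);
            case "~=": // one of the whitespace-separated words
                return !value.isEmpty()
                        && Arrays.asList(attrValue.trim().split("\\s+")).contains(value);
            case "|=": // exact value, or value followed by a hyphen
                return attrValue.equals(value) || attrValue.startsWith(value + "-");
            case "^=": // prefix
                return !value.isEmpty() && attrValue.startsWith(value);
            case "$=": // suffix
                return !value.isEmpty() && attrValue.endsWith(value);
            case "*=": // substring
                return !value.isEmpty() && attrValue.contains(value);
            default:
                throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        String href = "/files/offers.pdf"; // made-up href for the demo
        System.out.println(matches(href, "$=", ".pdf"));
        System.out.println(matches(href, "=", ".pdf"));
        System.out.println(matches("en-US", "|=", "en"));
    }
}
```

For the question's case, the suffix test is the one behind a[href$=".pdf"], and the exact-value test is the one behind a[title='Download offers in store'].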
