
How to get the href link from this a tag?

I successfully got the href links from the http://quotes.toscrape.com/ example by using:

response.css('div.quote > span > a::attr(href)').extract()

It returns the partial link inside the href of each a tag:

['/author/Albert-Einstein', '/author/J-K-Rowling', '/author/Albert-Einstein', '/author/Jane-Austen', '/author/Marilyn-Monroe', '/author/Albert-Einstein', '/author/Andre-Gide', '/author/Thomas-A-Edison', '/author/Eleanor-Roosevelt', '/author/Steve-Martin']

By the way, in the example above each a tag has this format:

<a href="/author/Albert-Einstein">(about)</a>

So I tried to do the same for this site: http://www.thegoodscentscompany.com/allproc-1.html. The problem here is that the a tag is structured a bit differently:

<a href="#" onclick="openMainWindow('http://www.thegoodscentscompany.com/data/rw1247381.html');return false;">formaldehyde</a>

As you can see, I can't get the link from href with the method above, because href only contains "#". I want to get the link (http://www.thegoodscentscompany.com/data/rw1247381.html) from this a tag, but I could not manage it. How can I get this link?

Try this:

response.css('a::attr(onclick)').re(r"Window\('(.*?)'\)")
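To see what the regular expression in the answer above does, here is a minimal sketch that applies it with Python's standard `re` module to the onclick value from the question, outside of Scrapy. The helper name `extract_link` is hypothetical and only for illustration. Note that in a raw string the parentheses are escaped with a single backslash.

```python
import re

# Sample onclick value copied from the a tag in the question.
onclick = (
    "openMainWindow('http://www.thegoodscentscompany.com/data/rw1247381.html');"
    "return false;"
)

# Match the tail of "openMainWindow", a literal "(", then lazily capture
# everything between the single quotes, up to the closing "')".
pattern = re.compile(r"Window\('(.*?)'\)")


def extract_link(onclick_value):
    """Hypothetical helper: return the URL inside openMainWindow('...'), or None."""
    match = pattern.search(onclick_value)
    return match.group(1) if match else None


link = extract_link(onclick)
print(link)
```

In Scrapy, `.re()` applies the pattern to each selected onclick attribute and returns the captured groups as a list of strings, so a tags without a matching onclick value simply contribute nothing to the result.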

