Beautiful Soup find_all doesn't find CSS selector with multiple classes

On the website there is this <a> element:

<a role="listitem" aria-level="1" href="https://www.rest.co.il" target="_blank" class="icon rest" title="this is main title" iconwidth="35px" aria-label="website connection" style="width: 30px; overflow: hidden;"></a>

So I use this code to select the element (note the find_all argument a.icon.rest):

import requests
from bs4 import BeautifulSoup

url = 'http://www.zap.co.il/models.aspx?sog=e-cellphone&pageinfo=1'
source_code = requests.get(url)
plain_text = source_code.text
soup  = BeautifulSoup(plain_text, "html.parser")
for link in soup.find_all("a.icon.rest"):
    x = link.get('href')
    print(x)

Unfortunately, this returns nothing, although the Beautiful Soup documentation clearly says:

If you want to search for tags that match two or more CSS classes, you should use a CSS selector:

css_soup.select("p.strikeout.body")
returns: <p class="body strikeout"></p>

So why isn't this working? By the way, I'm using PyCharm.

As the docs you quoted explain, if you want to search for tags that match two CSS classes, you have to use a CSS selector instead of find_all. The example you quoted shows how to do that:

css_soup.select("p.strikeout.body")

But you didn't do that; you used find_all anyway, and of course it didn't work, because find_all doesn't take a CSS selector. It treats a string argument as a tag name to match, so it looked for a tag literally named a.icon.rest.
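To see the failure in isolation, here is a minimal sketch that parses a trimmed-down copy of the <a> element from the question instead of fetching the live page (the shortened HTML string and the variable names are my own, chosen only for illustration):

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the <a> element from the question (illustrative).
html = '<a href="https://www.rest.co.il" class="icon rest" title="this is main title"></a>'
soup = BeautifulSoup(html, "html.parser")

# find_all treats a plain string as a tag name, so this searches for
# a tag literally called "a.icon.rest" and matches nothing.
as_tag_name = soup.find_all("a.icon.rest")
print(as_tag_name)

# Searching by the real tag name does find the element.
by_tag = soup.find_all("a")
print(len(by_tag))
```

The first call comes back empty for the same reason the loop in the question never prints anything.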

Change it to use select, which does take a CSS selector, and it will work.
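As a rough sketch of that change, run against an inline HTML snippet rather than the real site (the second <a> tag and its example.com URL are hypothetical, added only to show that tags with just one of the classes are skipped):

```python
from bs4 import BeautifulSoup

# Inline stand-in for the downloaded page; the second link is hypothetical.
html = """
<a href="https://www.rest.co.il" class="icon rest"></a>
<a href="https://example.com/other" class="icon"></a>
"""
soup = BeautifulSoup(html, "html.parser")

# select() takes a CSS selector: <a> tags that carry both classes.
hrefs = [link.get("href") for link in soup.select("a.icon.rest")]
print(hrefs)
```

In the original script, the only line that should need to change is the loop header: soup.select("a.icon.rest") in place of soup.find_all("a.icon.rest").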
