Beautiful Soup find_all() returning different results

I am trying to extract div elements by class from an HTML table on Amazon. When I run the code, find_all() sometimes returns the div elements I am looking for, and other times it returns an empty list. Any ideas on why the results vary?

I am pulling from this URL: https://www.amazon.com/dp/B0767653BK

My code:

from bs4 import BeautifulSoup
import requests

req = requests.get('https://www.amazon.com/dp/B0767653BK')
page = req.text
BSoup = BeautifulSoup(page, 'html.parser')
divClass = BSoup.find_all('div', class_='a-section a-spacing-none a-padding-none overflow_ellipsis')

It is better to use a Beautiful Soup CSS selector when trying to find all elements with a combination of CSS classes:

from bs4 import BeautifulSoup
import requests

req = requests.get('https://www.amazon.com/dp/B0767653BK')
soup = BeautifulSoup(req.text, 'html.parser')

for div_class in soup.select('div.a-section.a-spacing-none.a-padding-none.overflow_ellipsis'):
    print(div_class.get_text(strip=True))

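This is preferable as it allows the four classes to be present in any order. A multi-class string passed to class_ is compared against the exact value of the class attribute, so find_all() misses elements whose classes appear in a different order. With the selector, if the page changes the ordering of the classes, the elements will still be found.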
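To make the ordering point concrete, here is a minimal sketch that runs against an inline HTML string instead of the live page. The markup and the text inside the div elements are made up for illustration; only the class names come from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the same four classes, listed in two different orders.
html = """
<div class="a-section a-spacing-none a-padding-none overflow_ellipsis">first</div>
<div class="overflow_ellipsis a-section a-padding-none a-spacing-none">second</div>
"""

soup = BeautifulSoup(html, "html.parser")

# A multi-class string passed to class_ is matched against the exact attribute value.
exact = soup.find_all(
    "div", class_="a-section a-spacing-none a-padding-none overflow_ellipsis"
)

# A CSS selector only requires that every listed class is present, in any order.
selected = soup.select("div.a-section.a-spacing-none.a-padding-none.overflow_ellipsis")

exact_texts = [div.get_text(strip=True) for div in exact]
selected_texts = [div.get_text(strip=True) for div in selected]
print(exact_texts)
print(selected_texts)
```

On this sample, find_all() should pick up only the first div, while select() should pick up both.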

Take a look at "Searching by CSS class" in the documentation.
