Beautiful Soup code returns "AttributeError"

Question

I'm building a web scraper that returns the names of cafes, which appear on the website like this: <h2 class="venue-title" itemprop="name">Prior</h2> However, it returns this error:

"ResultSet object has no attribute '%s'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?" % key
AttributeError: ResultSet object has no attribute 'text'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?
[Finished in 0.699s]

Here is the code:

from bs4 import BeautifulSoup
import requests

url = 'https://www.broadsheet.com.au/melbourne/guides/best-cafes-thornbury'
response = requests.get(url, timeout=5)

soup_cafe_list = BeautifulSoup(response.content, "html.parser")
type(soup_cafe_list)

cafes = soup_cafe_list.findAll('h2', attrs_={"class":"venue-title"}).text
print(cafes)

I have tried a whole range of things to figure it out. I suspect it has something to do with the findAll arguments: cafes = soup_cafe_list.findAll('h2', attrs_={"class":"venue-title"}).text because when I run it as cafes = soup_cafe_list.findAll('h2', class_="venue-title") instead, it sort of works, except that it doesn't return the items stripped of their HTML, which I believe .text should do?

Another thing I'm noticing in the traceback is that it may be referring to a different directory for bs4. Could this have anything to do with it? I started off using Jupyter and am now on Atom, but may have installed bs4 incorrectly:

File "/Users/[xxxxxxxx]/Desktop/Coding/amvpscraper/webscraper.py", line 10, in <module>
    cafes = soup_cafe_list.findAll('h2', attrs_={"class":"venue-title"}).text
File "/Users/[xxxxxxxx]/opt/anaconda3/lib/python3.7/site-packages/bs4/element.py", line 2081, in __getattr__

Not sure if I am doing something else wrong...

Answer 1

The error indicates that the return value of the findAll method is a list of elements, and the list itself has no text attribute. Save the result in a list (without calling .text on it) and replace attrs_ with attrs:

cafes = soup_cafe_list.findAll('h2', attrs={"class":"venue-title"})

and then iterate through the list and get the text of each element. You can do that with a list comprehension:

cafes = [el.text for el in cafes]

Edit: A list comprehension is a compact form of a for loop. You could also write:

res_list = []
for el in cafes:
    res_list.append(el.text)
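
To see the cause and the fix side by side without hitting the network, here is a minimal sketch. The html string is a hypothetical stand-in for the page's markup (only the h2 pattern comes from the question), and the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page's markup, so no network request is needed.
html = """
<h2 class="venue-title" itemprop="name">Prior</h2>
<h2 class="venue-title" itemprop="name">Rat the Cafe</h2>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet (a list subclass), even if only one tag matches.
headings = soup.find_all("h2", class_="venue-title")

# Accessing .text on the ResultSet itself raises the error from the question.
try:
    headings.text
except AttributeError as exc:
    print("AttributeError:", exc)

# .text lives on each individual Tag, so read it per element.
names = [h.text for h in headings]
print(names)  # ['Prior', 'Rat the Cafe']

# find() returns only the first matching Tag, so .text works on it directly.
first = soup.find("h2", class_="venue-title")
print(first.text)  # Prior
```

So find() suits the case where a single element is wanted, while find_all() needs a loop or comprehension over its result.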

Additionally, you may add a try-except clause or a check for a non-empty text field within the loop to handle elements that have no text.
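
One way to do that check, as a sketch under assumptions: the markup below is made up to include an empty heading, and get_text(strip=True) is used in place of .text so that surrounding whitespace is dropped. Since an empty tag yields an empty string rather than raising, a simple truthiness test is probably enough here.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one padded heading, one empty heading, one nested link.
html = """
<h2 class="venue-title">  Prior  </h2>
<h2 class="venue-title"></h2>
<h2 class="venue-title"><a href="#">Short Round</a></h2>
"""

soup = BeautifulSoup(html, "html.parser")

names = []
for el in soup.find_all("h2", class_="venue-title"):
    # get_text(strip=True) collects the text of the tag and its children,
    # trimming leading and trailing whitespace from each piece.
    text = el.get_text(strip=True)
    if text:  # skip headings that contain no text at all
        names.append(text)

print(names)  # ['Prior', 'Short Round']
```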

Output:

['Prior',
 'Rat the Cafe',
 'Ampersand Coffee and Food',
 'Umberto Espresso Bar',
 'Brother Alec',
 'Short Round',
 'Jerry Joy',
 'The Old Milk Bar',
 'Little Henri',
 'Northern Soul']
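
On the side question about the traceback path: the path only shows which copy of bs4 was imported (here, the one under anaconda3), and the error message itself comes from bs4, so the installation is probably not the cause. If in doubt, a quick check like the following sketch shows which interpreter and which bs4 copy the editor is actually running; it assumes nothing beyond bs4 being importable.

```python
import sys

import bs4

# The interpreter that is running this script (Jupyter and Atom may differ).
print(sys.executable)

# The version and on-disk location of the bs4 package that was imported.
print(bs4.__version__)
print(bs4.__file__)
```

If the printed paths differ between Jupyter and Atom, the two tools are using different Python environments.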

Solution 1
1 2020-07-11 12:19:41
