
Attribute Error With Beautiful Soup And Python

I had a working piece of code, and then I ran it today and it's broken. I have pulled out the relevant section that is giving me problems.

from bs4 import BeautifulSoup
import requests

webpage = requests.get('http://www.bbcgoodfood.com/search/recipes?query=')

soup = BeautifulSoup(webpage.content) 
links = soup.find("div",{"class":"main row grid-padding"}).find_all("h2",{"class":"node-title"})

for link in links:
    print(link.a["href"]) 

This gives me the error "AttributeError: 'NoneType' object has no attribute 'find_all'"

What precisely is this error telling me?

find_all() is a valid method in the Beautiful Soup documentation. Looking through the webpage's source code, my path to my desired object seems to make sense.

I think something must have changed with the website, because I don't see how my code could just stop working. But I don't understand the error message that well...

Thanks for any help you can give!

The site you are trying to parse doesn't "like" your user agent and returns a 403 error; the parser then fails because it cannot find the div. Try changing the user agent to that of one of the browsers:

webpage = requests.get('http://www.bbcgoodfood.com/search/recipes?query=', headers = {'user-agent':'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'})
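A variation on the same idea: the header can be attached once to a requests.Session so every request made through it carries it. This is only a sketch; the User-Agent string is an example browser UA, not the only value that works, and the actual GET is commented out so the snippet runs without network access:

```python
import requests

# Set a browser-like User-Agent once on a Session; every request made
# through this session will send it automatically.
session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36",
})

# webpage = session.get('http://www.bbcgoodfood.com/search/recipes?query=')
print(session.headers["User-Agent"])
```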

This is because when you tried to access the page, it gave you permission denied, so soup.find() returns None, and None has no attribute find_all(), which raises an AttributeError.
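You can reproduce the mechanics of this error without any network access; in this sketch a plain None stands in for the failed soup.find() call:

```python
# soup.find() returns None when nothing matches; calling any method on
# None raises exactly the error from the question.
result = None  # stand-in for soup.find("div", {"class": "main row grid-padding"})

try:
    result.find_all("h2", {"class": "node-title"})
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute 'find_all'

# A defensive version checks before chaining:
links = result.find_all("h2") if result is not None else []
print(links)  # []
```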

from bs4 import BeautifulSoup
import requests

webpage = requests.get('http://www.bbcgoodfood.com/search/recipes?query=')


print(webpage.content)
<HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>

You don't have permission to access "http&#58;&#47;&#47;www&#46;bbcgoodfood&#46;com&#47;search&#47;recipes&#63;" on this server.<P>
Reference&#32;&#35;18&#46;4fa9cd17&#46;1428789762&#46;680369dc
</BODY>
</HTML>

If you resolve this by adding a header with a proper user agent, as @Vader suggested, your code will then run fine:

...
headers = {'User-agent': 'Mozilla/5.0'}
webpage = requests.get('http://www.bbcgoodfood.com/search/recipes?query=', headers=headers)

soup = BeautifulSoup(webpage.content) 
links = soup.find("div",{"class":"main row grid-padding"}).find_all("h2",{"class":"node-title"})

for link in links:
    print(link.a["href"])

/recipes/4942/lemon-drizzle-cake
/recipes/3092/ultimate-chocolate-cake
/recipes/3228/chilli-con-carne
/recipes/3229/yummy-scrummy-carrot-cake
/recipes/1223/bestever-brownies
/recipes/1167651/chicken-and-chorizo-jambalaya
/recipes/2089/spiced-carrot-and-lentil-soup
/recipes/1521/summerinwinter-chicken
/recipes/1364/spicy-root-and-lentil-casserole
/recipes/4814/mustardstuffed-chicken
/recipes/4622/classic-scones-with-jam-and-clotted-cream
/recipes/333614/red-lentil-chickpea-and-chilli-soup
/recipes/5605/falafel-burgers
/recipes/11695/raspberry-bakewell-cake
/recipes/4686/chicken-biryani
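More generally, the lookup can be made robust against the page changing again by guarding each step. The HTML below is a made-up stand-in mimicking the page structure from the question, so the snippet runs offline:

```python
from bs4 import BeautifulSoup

# Made-up HTML modeled on the question's selectors; the real page differs,
# this only demonstrates the guard pattern.
html = """
<div class="main row grid-padding">
  <h2 class="node-title"><a href="/recipes/4942/lemon-drizzle-cake">Lemon drizzle cake</a></h2>
  <h2 class="node-title"><a href="/recipes/3092/ultimate-chocolate-cake">Ultimate chocolate cake</a></h2>
</div>
"""

soup = BeautifulSoup(html, "html.parser")  # explicit parser avoids bs4's warning
container = soup.find("div", {"class": "main row grid-padding"})

hrefs = []
if container is not None:  # guard against the AttributeError from the question
    for h2 in container.find_all("h2", {"class": "node-title"}):
        if h2.a is not None and h2.a.has_attr("href"):
            hrefs.append(h2.a["href"])

print(hrefs)
```

If the div disappears again (or the request is blocked), this prints an empty list instead of crashing, which makes the failure easier to diagnose.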
