Receiving an error in BS4 while Amazon web scraping: AttributeError: 'NoneType' object has no attribute 'get_text'
!pip install requests
!pip install bs4
import requests
from bs4 import BeautifulSoup
url = "https://www.amazon.in/Apple-iPhone-Pro-Max-256GB/dp/B07XVLH744/ref=sr_1_1_sspa?crid=2VCKZNOH3H6SR&keywords=apple+iphone+11+pro+max&qid=1582043410&sprefix=apple+iphone%2Caps%2C388&sr=8-1-spons&psc=1&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUEyVjdZSE83TzU4UUMmZW5jcnlwdGVkSWQ9QTAyNTI1ODZJUzZOVUwxWDNIUlAmZW5jcnlwdGVkQWRJZD1BMDkxNDg4MzFLMFpVT1M5OFM5Q0smd2lkZ2V0TmFtZT1zcF9hdGYmYWN0aW9uPWNsaWNrUmVkaXJlY3QmZG9Ob3RMb2dDbGljaz10cnVl"
headers = {"User-Agent": "in this section im adding my user agent after typing my user agent in google search"}
page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.content, "html.parser")
print(soup.prettify())
title = soup.find(id = "productTitle").get_text()
price = soup.find(id = "priceblock_ourprice").get_text()
converted_price = price[0:8]
print(converted_price)
print(title)
I am working in Google Colab. When I run this code, I get this error:
AttributeError Traceback (most recent call last)
<ipython-input-15-14696d9dc778> in <module>()
16 print(soup.prettify())
17
---> 18 title = soup.find(id = "productTitle").get_text()
19 price = soup.find(id = "priceblock_ourprice").get_text()
20
AttributeError: 'NoneType' object has no attribute 'get_text'
I have searched all over the internet but have not found an answer that addresses my question. I am trying to get the iPhone 11 Pro Max price. When I run this code, I get the error mentioned above.
soup.find(id = "productTitle") is returning None because it cannot find an element with id = "productTitle". Make sure you are searching for the correct element.
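One way to confirm whether the id is really present in the HTML that came back is to collect every id in the response text. The sketch below is only an illustration and is not from the original answer: it swaps BeautifulSoup for the standard-library html.parser so it runs without extra packages, and the names IdCollector and page_has_id are hypothetical.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects the value of every id attribute seen in the document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def page_has_id(html, element_id):
    """Return True if some element in html carries the given id."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return element_id in collector.ids
```

With the question's code, this could be called as page_has_id(page.text, "productTitle"). If it returns False, the page that was fetched does not contain the element at all, so find has nothing to match.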
For find statements, I would suggest always writing an if condition to avoid and handle this kind of error.
title = soup.find(id = "productTitle")
if title:
    title = title.get_text()
else:
    title = "default_title"
Do the same for price, which comes from soup.find(id = "priceblock_ourprice").
You get that error when you try to pull data out of an object whose value is None. If you are seeing it on line 18, it means your soup.find(id = "productTitle") did not match anything and returned None.
You need to break your processing down into steps. Check the return value first before accessing it. So...
title_info = soup.find(id = "productTitle")
if title_info:
    title = title_info.text
else:
    title = None  # handle the missing element here
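The check-before-access pattern can be wrapped in a small helper so that every lookup is handled the same way. This is a minimal sketch rather than part of the original answers: text_or_default is a hypothetical name, and it assumes its first argument is whatever soup.find(...) returned, that is, either an element with a get_text() method or None.

```python
def text_or_default(element, default=None):
    """Return the stripped text of a find() result, or default if nothing matched."""
    if element is None:
        # find() matched nothing, so there is no text to read
        return default
    return element.get_text().strip()
```

In the question's code this would look like title = text_or_default(soup.find(id = "productTitle"), "default_title"), and the same call works for the price lookup.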
Well, I tested your code here and it works normally. However, Amazon returns a 503 response when you request the same link several times within a short period:
<html>
<head>
<title>
503 - Service Unavailable Error
</title>
</head>
<body bgcolor="#FFFFFF" text="#000000">
<!--
To discuss automated access to Amazon data please contact api-services-support@amazon.com.
For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.in/ref=rm_5_sv, or our Product Advertising API at https://affiliate-program.amazon.in/gp/advertising/api/detail/main.html/ref=rm_5_ac for advertising use cases.
-->
<center>
<a href="https://www.amazon.in/ref=cs_503_logo/">
<img alt="Amazon.in" border="0" height="45" src="https://images-eu.ssl-images-amazon.com/images/G/31/x-locale/communities/people/logo.gif" width="200"/>
</a>
<p align="center">
<font face="Verdana,Arial,Helvetica">
<font color="#CC6600" size="+2">
<b>
Oops!
</b>
</font>
<br/>
<b>
It's rush hour and traffic is piling up on that page. Please try again in a short while.
<br/>
If you were trying to place an order, it will not have been processed at this time.
</b>
<p>
<img alt="*" border="0" height="9" src="https://images-eu.ssl-images-amazon.com/images/G/02/x-locale/common/orange-arrow.gif" width="10"/>
<b>
<a href="https://www.amazon.in/ref=cs_503_link/">
Go to the Amazon.in home page to continue shopping
</a>
</b>
</p>
</font>
</p>
</center>
</body>
</html>
Wait a while before trying again, or at least test with a longer interval between requests.
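A rough way to automate the waiting is to retry only when the status code is 503 and to pause a little longer before each new attempt. The sketch below is an assumption-laden illustration, not the answerer's code: fetch_with_retry, attempts, and wait_seconds are hypothetical names, and get stands for any callable shaped like requests.get that returns an object with a status_code attribute.

```python
import time


def fetch_with_retry(get, url, headers=None, attempts=3, wait_seconds=5.0,
                     sleep=time.sleep):
    """Call get(url, headers=headers), retrying with growing pauses on HTTP 503."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    response = None
    for attempt in range(1, attempts + 1):
        response = get(url, headers=headers)
        if response.status_code != 503:
            # Anything other than 503 is returned to the caller as-is
            return response
        if attempt < attempts:
            # Wait longer after each failed attempt: 1x, 2x, 3x, ...
            sleep(wait_seconds * attempt)
    # Still 503 after the last attempt; the caller should check status_code
    return response
```

With the question's code this would be page = fetch_with_retry(requests.get, url, headers=headers). Checking page.status_code before building the soup then tells you whether you received the product page or the error page shown above.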
Try this code as well:
title = soup.find(id="productTitle")
if title:
    title = title.get_text()
else:
    title = "default_title"
price = soup.find(id="priceblock_ourprice")
if price:
    # Read the element's text instead of slicing its HTML string
    price = price.get_text().strip()
else:
    price = "default_price"
print(price)
print(title)
Try using another IDE.
For example, go to repl.it (https://repl.it), create a new repl, and run the code there.