
Receiving an error in BS4 while Amazon web scraping: AttributeError: 'NoneType' object has no attribute 'get_text'

!pip install requests
!pip install bs4


import requests
from bs4 import BeautifulSoup

url = "https://www.amazon.in/Apple-iPhone-Pro-Max-256GB/dp/B07XVLH744/ref=sr_1_1_sspa?crid=2VCKZNOH3H6SR&keywords=apple+iphone+11+pro+max&qid=1582043410&sprefix=apple+iphone%2Caps%2C388&sr=8-1-spons&psc=1&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUEyVjdZSE83TzU4UUMmZW5jcnlwdGVkSWQ9QTAyNTI1ODZJUzZOVUwxWDNIUlAmZW5jcnlwdGVkQWRJZD1BMDkxNDg4MzFLMFpVT1M5OFM5Q0smd2lkZ2V0TmFtZT1zcF9hdGYmYWN0aW9uPWNsaWNrUmVkaXJlY3QmZG9Ob3RMb2dDbGljaz10cnVl"

headers = {"User-Agent": "in this section im adding my user agent after typing my user agent in google search"}

page = requests.get(url, headers=headers)

soup = BeautifulSoup(page.content, "html.parser")

print(soup.prettify()) 

title = soup.find(id = "productTitle").get_text()
price = soup.find(id = "priceblock_ourprice").get_text()

converted_price = price[0:8]

print(converted_price)
print(title)

I am working in Google Colab. When I run this code, I get this error:

AttributeError   Traceback (most recent call last)
<ipython-input-15-14696d9dc778> in <module>()
     16 print(soup.prettify())
     17 
---> 18 title = soup.find(id = "productTitle").get_text()
     19 price = soup.find(id = "priceblock_ourprice").get_text()
     20 

AttributeError: 'NoneType' object has no attribute 'get_text'

I have searched all over the internet but have not found an answer that addresses my question. I am trying to get the iPhone 11 Pro Max price. When I run this code, I get the error mentioned above.

  • soup.find(id = "productTitle") is returning None because it cannot find an element with id = "productTitle". Make sure you are searching for the correct element.

  • For find statements, I would suggest always writing an if condition to catch and handle this kind of error.

title = soup.find(id = "productTitle")
if title:
    title = title.get_text()
else:
    title = "default_title"

price = soup.find(id = "priceblock_ourprice").get_text()
  • You can do the same with price.
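
The None check above can be wrapped in a small helper so the same guard covers both the title and the price. This is only a sketch: text_or_default is a hypothetical name, not part of BeautifulSoup, and FakeTag is a stand-in that mimics the get_text() method of a bs4 tag so the example runs without network access.

```python
def text_or_default(element, default=""):
    """Return the element's stripped text, or default when the lookup returned None."""
    if element is None:
        return default
    return element.get_text().strip()


class FakeTag:
    """Hypothetical stand-in for a bs4 Tag; only get_text() is mimicked."""

    def __init__(self, text):
        self._text = text

    def get_text(self):
        return self._text


# A found element yields its text; a failed lookup (None) yields the default.
print(text_or_default(FakeTag("  Apple iPhone 11 Pro Max  "), "default_title"))
print(text_or_default(None, "default_title"))
```

With the soup from the question, the call would presumably look like title = text_or_default(soup.find(id = "productTitle"), "default_title").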

You get that error when you try to pull data out of an object whose value is None. If you are seeing it on line 18, it means your soup.find(id = "productTitle") did not match anything and returned None.

You need to break your processing down into steps. Check the return value before accessing it. So:

title_info = soup.find(id = "productTitle")
if title_info:
    title = title_info.text
else:
    pass  # handle the situation here

Well, I tested your code here and it works normally. However, Amazon returns a 503 code when you request the same link repeatedly within a short time:

<html>
 <head>
  <title>
   503 - Service Unavailable Error
  </title>
 </head>
 <body bgcolor="#FFFFFF" text="#000000">
  <!--
        To discuss automated access to Amazon data please contact api-services-support@amazon.com.
        For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.in/ref=rm_5_sv, or our Product Advertising API at https://affiliate-program.amazon.in/gp/advertising/api/detail/main.html/ref=rm_5_ac for advertising use cases.
-->
  <center>
   <a href="https://www.amazon.in/ref=cs_503_logo/">
    <img alt="Amazon.in" border="0" height="45" src="https://images-eu.ssl-images-amazon.com/images/G/31/x-locale/communities/people/logo.gif" width="200"/>
   </a>
   <p align="center">
    <font face="Verdana,Arial,Helvetica">
     <font color="#CC6600" size="+2">
      <b>
       Oops!
      </b>
     </font>
     <br/>
     <b>
      It's rush hour and traffic is piling up on that page. Please try again in a short while.
      <br/>
      If you were trying to place an order, it will not have been processed at this time.
     </b>
     <p>
      <img alt="*" border="0" height="9" src="https://images-eu.ssl-images-amazon.com/images/G/02/x-locale/common/orange-arrow.gif" width="10"/>
      <b>
       <a href="https://www.amazon.in/ref=cs_503_link/">
        Go to the Amazon.in home page to continue shopping
       </a>
      </b>
     </p>
    </font>
   </p>
  </center>
 </body>
</html>

Wait a while before trying again, or at least leave more time between requests.
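
One way to leave more time between requests is to retry with a growing delay whenever the response is a 503. The following is a minimal sketch under stated assumptions: fetch_with_retry is a hypothetical helper, the attempt count and delay are arbitrary, and fetch is any zero-argument callable whose result has a status_code attribute (as the response from requests.get does).

```python
import time


def fetch_with_retry(fetch, attempts=3, delay=5.0, sleep=time.sleep):
    """Call fetch() until the response is not a 503, waiting longer after each failure.

    fetch: zero-argument callable returning an object with a status_code
    attribute, for example lambda: requests.get(url, headers=headers).
    sleep: injectable so the waiting can be replaced in tests.
    """
    response = fetch()
    for attempt in range(1, attempts):
        if response.status_code != 503:
            break
        sleep(delay * attempt)  # wait 1x, 2x, ... the base delay between tries
        response = fetch()
    return response
```

The caller should still check response.status_code before parsing, because every attempt may come back as 503.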

Try this code as well:

title = soup.find(id="productTitle")
if title:
    title = title.get_text()
else:
    title = "default_title"

price = soup.find(id="priceblock_ourprice")
if not price:
    price = "default_title"

# converted_price = price[0:8]
# Slice the price out of the tag's string form; the offsets depend on the exact markup.
convert = str(price)
con = convert[-18:-11]

print(con)
print(title)
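
Slicing at fixed offsets, such as price[0:8] or convert[-18:-11], breaks as soon as the markup or the number of digits changes. A possible alternative is to pull the number out of the text with a regular expression. This is a sketch only: parse_price is a hypothetical helper, and the sample string is an assumed price format rather than output taken from the page.

```python
import re


def parse_price(text):
    """Return the first number found in text as a float, or None if there is none."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    # Drop thousands separators before converting.
    return float(match.group().replace(",", ""))


print(parse_price("₹ 1,17,100.00"))
print(parse_price("default_title"))
```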

Try using another IDE.

Use repl.it (https://repl.it): create a new repl and run the code there.
