
Receiving an error in BS4 while scraping Amazon: AttributeError: 'NoneType' object has no attribute 'get_text'

!pip install requests
!pip install bs4


import requests
from bs4 import BeautifulSoup

url = "https://www.amazon.in/Apple-iPhone-Pro-Max-256GB/dp/B07XVLH744/ref=sr_1_1_sspa?crid=2VCKZNOH3H6SR&keywords=apple+iphone+11+pro+max&qid=1582043410&sprefix=apple+iphone%2Caps%2C388&sr=8-1-spons&psc=1&spLa=ZW5jcnlwdGVkUXVhbGlmaWVyPUEyVjdZSE83TzU4UUMmZW5jcnlwdGVkSWQ9QTAyNTI1ODZJUzZOVUwxWDNIUlAmZW5jcnlwdGVkQWRJZD1BMDkxNDg4MzFLMFpVT1M5OFM5Q0smd2lkZ2V0TmFtZT1zcF9hdGYmYWN0aW9uPWNsaWNrUmVkaXJlY3QmZG9Ob3RMb2dDbGljaz10cnVl"

headers = {"User-Agent": "in this section im adding my user agent after typing my user agent in google search"}

page = requests.get(url, headers=headers)

soup = BeautifulSoup(page.content, "html.parser")

print(soup.prettify()) 

title = soup.find(id = "productTitle").get_text()
price = soup.find(id = "priceblock_ourprice").get_text()

converted_price = price[0:8]

print(converted_price)
print(title)

I am working in Google Colab. When I run this code, I get this error:

AttributeError   Traceback (most recent call last)
<ipython-input-15-14696d9dc778> in <module>()
     16 print(soup.prettify())
     17 
---> 18 title = soup.find(id = "productTitle").get_text()
     19 price = soup.find(id = "priceblock_ourprice").get_text()
     20 

AttributeError: 'NoneType' object has no attribute 'get_text'

I have searched all over the internet but have not found an answer that addresses my question. I am trying to get the price of the iPhone 11 Pro Max. When I run this code, I get the error shown above.

  • soup.find(id = "productTitle") is returning None because it cannot find an element with id = "productTitle". Make sure you are searching for the correct element.

  • For find statements, I would suggest always writing an if condition to catch and handle this kind of error.

title = soup.find(id = "productTitle")
if title:
    title = title.get_text()
else:
    title = "default_title"

price = soup.find(id = "priceblock_ourprice").get_text()
  • You can do the same with price.
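
The check above can be wrapped in a small helper so that title and price go through the same code path. This is only a minimal sketch: the name text_or_default and its parameters are made up for illustration (they are not part of bs4), and soup is assumed to be the BeautifulSoup object from the question.

```python
def text_or_default(soup, element_id, default=None):
    """Return the stripped text of the element with this id, or `default` if absent.

    `soup` is expected to be a BeautifulSoup object (anything with a
    compatible find(id=...) method works). The helper name is hypothetical.
    """
    element = soup.find(id=element_id)
    if element is None:
        # find() returns None when nothing matches, so never call get_text() on it.
        return default
    return element.get_text().strip()

# Usage with the soup from the question (kept as comments because it needs that object):
# title = text_or_default(soup, "productTitle", "default_title")
# price = text_or_default(soup, "priceblock_ourprice", "default_price")
```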

You get that error when you're trying to pull data out of an object whose value is None. If you're seeing that on line 18, it means your soup.find(id = "productTitle") did not match anything and returned None.

You need to break down your processing into steps. Check for the return value first before accessing it. So...

title_info = soup.find(id = "productTitle")
if title_info:
    title = title_info.text
else:
    title = None  # handle the missing title here
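
One possible way to handle the situation is to fail with a message that names the missing element, rather than the generic AttributeError. A sketch only: require_text and MissingElementError are invented names, and soup is assumed to be the BeautifulSoup object from the question.

```python
class MissingElementError(LookupError):
    """Raised when an expected element is not in the parsed page (hypothetical name)."""


def require_text(soup, element_id):
    """Return the element's text, or raise an error that names the missing id."""
    element = soup.find(id=element_id)
    if element is None:
        raise MissingElementError(
            f"no element with id={element_id!r}; the page may be an error "
            "or bot-check page rather than the product page"
        )
    return element.text

# Usage with the soup from the question:
# title = require_text(soup, "productTitle")
```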

Well, I tested your code here and it works normally. However, Amazon returns a 503 status code when you request the same link several times within a short period:

<html>
 <head>
  <title>
   503 - Service Unavailable Error
  </title>
 </head>
 <body bgcolor="#FFFFFF" text="#000000">
  <!--
        To discuss automated access to Amazon data please contact api-services-support@amazon.com.
        For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.in/ref=rm_5_sv, or our Product Advertising API at https://affiliate-program.amazon.in/gp/advertising/api/detail/main.html/ref=rm_5_ac for advertising use cases.
-->
  <center>
   <a href="https://www.amazon.in/ref=cs_503_logo/">
    <img alt="Amazon.in" border="0" height="45" src="https://images-eu.ssl-images-amazon.com/images/G/31/x-locale/communities/people/logo.gif" width="200"/>
   </a>
   <p align="center">
    <font face="Verdana,Arial,Helvetica">
     <font color="#CC6600" size="+2">
      <b>
       Oops!
      </b>
     </font>
     <br/>
     <b>
      It's rush hour and traffic is piling up on that page. Please try again in a short while.
      <br/>
      If you were trying to place an order, it will not have been processed at this time.
     </b>
     <p>
      <img alt="*" border="0" height="9" src="https://images-eu.ssl-images-amazon.com/images/G/02/x-locale/common/orange-arrow.gif" width="10"/>
      <b>
       <a href="https://www.amazon.in/ref=cs_503_link/">
        Go to the Amazon.in home page to continue shopping
       </a>
      </b>
     </p>
    </font>
   </p>
  </center>
 </body>
</html>

Wait a while before trying again, or at least leave more time between requests.
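
The advice above amounts to checking the status code and pausing before asking again. A rough sketch, assuming fetch is a callable such as requests.get that returns an object with a status_code attribute. The function name, the retry count, and the delays are arbitrary choices for illustration, and Amazon may keep refusing automated requests regardless.

```python
import time


def fetch_with_backoff(fetch, url, retries=3, first_delay=5.0, sleep=time.sleep, **kwargs):
    """Call fetch(url, **kwargs) until it stops returning 503 or retries run out.

    The wait doubles after every 503. The last response is returned either way,
    so the caller should still check response.status_code.
    """
    delay = first_delay
    response = fetch(url, **kwargs)
    for _ in range(retries):
        if response.status_code != 503:
            break
        sleep(delay)  # give the server some time before asking again
        delay *= 2
        response = fetch(url, **kwargs)
    return response

# Usage with the names from the question (requires the requests package):
# page = fetch_with_backoff(requests.get, url, headers=headers)
# if page.status_code == 200:
#     soup = BeautifulSoup(page.content, "html.parser")
```

Passing sleep as a parameter is only there so the waiting can be replaced in a test; in normal use the default time.sleep is fine.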

Try this code as well:

title = soup.find(id="productTitle")
if title:
    title = title.get_text()
else:
    title = "default_title"

price = soup.find(id="priceblock_ourprice")
if price:
    # converted_price = price[0:8]
    # Slice the digits out of the tag's string form.
    # This is fragile: it depends on the exact markup and price length.
    convert = str(price)
    con = convert[-18:-11]
else:
    con = "default_price"

print(con)
print(title)
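
Slicing by fixed character positions breaks as soon as the price has a different number of digits. As a sketch of an alternative, the number can be pulled out of the element's text with a regular expression. parse_price is an illustrative name, and the sample strings are made up rather than taken from the actual page.

```python
import re


def parse_price(text):
    """Extract the first number from a price string, ignoring currency symbols and commas.

    Returns a float, or None if the text contains no digits.
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return float(match.group().replace(",", ""))


print(parse_price("₹ 1,17,100.00"))  # 117100.0
print(parse_price("unavailable"))     # None
```

With the soup from the question, this could be called as parse_price(price.get_text()) once price is known not to be None.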

Try using another IDE.

Use repl.it (https://repl.it): create a new repl and run the code there.
