Web scraping error while using Beautiful Soup and requests in Python

I'm trying to write code that tracks the Amazon price of a product. The code is below:

import requests
from bs4 import BeautifulSoup
url='https://www.amazon.com/LunaJany-Womens-Striped-Office-Career/dp/B01DPLT4AC/ref=sxin_7_ac_d_rm?ac_md=2-2-ZHJlc3NlcyBmb3Igd29tZW4gd29yayBjYXN1YWw%3D-ac_d_rm&crid=1POYCFAFYAR8B&cv_ct_cx=dresses+for+women+casual+summer&dchild=1&keywords=dresses+for+women+casual+summer&pd_rd_i=B01DPLT4AC&pd_rd_r=0b613dda-1077-46d2-b403-af7e15840645&pd_rd_w=7Mp2P&pd_rd_wg=rNofK&pf_rd_p=a0516f22-66df-4efd-8b9a-279a864d1512&pf_rd_r=1P30PXW75XA27N3M6VDK&psc=1&qid=1592310609&sprefix=dre%2Caps%2C440&sr=1-3-12d4272d-8adb-4121-8624-135149aa9081'
header={"user-agent":'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36'}
page=requests.get(url,headers=header)
soup1=BeautifulSoup(page.content,"html.parser")
soup2=BeautifulSoup(soup1.prettify(),"html.parser")
title=soup2.find(id="productTitle").get_text()
print(title)

When I try to print the title, I get this error:

Traceback (most recent call last):
  File "C:/Users/Patterns/PycharmProjects/RUBI/Tracks amozon prices.py", line 8, in <module>
    title=soup2.find(id="productTitle").getText()
AttributeError: 'NoneType' object has no attribute 'getText'

Could anyone help me out?

The error says that NoneType has no attribute "getText" (an alias of "get_text"), which means no element with the id "productTitle" was found, so find() returned None. None is a NoneType object and therefore has no "get_text" attribute.

Tip: try tweaking productTitle. I am not sure, but it might not be the right element for the item whose price you are trying to track.
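
A minimal sketch of guarding against the None case described in this answer. It parses a small inline HTML string instead of the real Amazon response, and both the helper name extract_title and the sample HTML are hypothetical, used only for illustration.

```python
from bs4 import BeautifulSoup


def extract_title(html):
    """Return the text of the element with id="productTitle", or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.find(id="productTitle")  # find() returns None when nothing matches
    if element is None:
        return None
    return element.get_text(strip=True)


# Hypothetical sample pages, not real Amazon markup.
product_html = '<html><body><span id="productTitle">  Striped Dress  </span></body></html>'
other_html = "<html><body><p>No title here</p></body></html>"

print(extract_title(product_html))
print(extract_title(other_html))
```

Checking for None before calling get_text() turns the AttributeError into a value the caller can handle, for example by logging the page that was actually received.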

It's because the page you received contains no element with the id productTitle. When you fetch the page with requests and load it into Beautiful Soup, Amazon serves a robot check page (titled "Robot Check") instead of the product page.

Just print the page content with print(soup2) and you will see the reason for the error.
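
As a rough sketch of that diagnosis, the check below inspects the parsed page rather than printing all of it. The helper name describe_page and the sample HTML are hypothetical; the "Robot Check" title is taken from this answer, and the page Amazon actually returns may look different.

```python
from bs4 import BeautifulSoup


def describe_page(html):
    """Return a short diagnosis of what kind of page was received."""
    soup = BeautifulSoup(html, "html.parser")
    title_tag = soup.title  # None if the document has no <title>
    page_title = title_tag.get_text(strip=True) if title_tag is not None else ""
    if "Robot Check" in page_title:
        return "robot check page"
    if soup.find(id="productTitle") is None:
        return "no productTitle element"
    return "product page"


# Hypothetical stand-in for a blocked response, not real Amazon markup.
blocked_html = "<html><head><title>Robot Check</title></head><body></body></html>"
print(describe_page(blocked_html))
```

With the real response you would pass page.content to the helper, and it may also be worth looking at page.status_code before parsing.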
