Getting NoneType error with Beautiful Soup even though the object exists
I'm trying to scrape the webpage https://www.cars.com/dealers/5374692/carvana-touchless-delivery-to-your-home/
On this page there is a "See All Vehicles" button, and I'm trying to get the href of that tag.
So far I have made this work with Selenium, but opening a webdriver every time takes too long, so I would rather not use Selenium.
With BeautifulSoup, however, I get a NoneType error. My code is:
import requests
from bs4 import BeautifulSoup
import re

base_url = 'https://www.cars.com/'

def request_page(url):
    session = requests.Session()
    my_headers = {"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:88.0) Gecko/20100101 Firefox/88.0","Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"}
    response = session.get(url, headers=my_headers)
    soup = BeautifulSoup(re.sub("<!---->","", response.text), "lxml")
    return soup

def dealers_subpage(url):
    try:
        soup = request_page(url)
        descript = soup.find('dpp-update-inventory-link')
        print(descript.prettify())
        link = descript.find('a')['href']
        return base_url+str(link)
    except Exception as e:
        print(e,url)

dealers_subpage('https://www.cars.com/dealers/5374692/carvana-touchless-delivery-to-your-home/')
For this code I'm getting this message:
<dpp-update-inventory-link new-count="" party-id="74424458" used-count="100" zipcode="11763">
</dpp-update-inventory-link>
'NoneType' object is not subscriptable https://www.cars.com/dealers/5374692/carvana-touchless-delivery-to-your-home/
My question is: why is it not reading the a tag that is present there?
Note: use incognito/private mode to visit the webpage, because in a normal window it redirects to some other page.
The page loads its content dynamically, so you cannot get the a tag inside dpp-update-inventory-link from the HTML that requests downloads. Even when you print descript.prettify(), the a tag is not there, which means it is rendered in the browser after the page loads. That is why descript.find('a') returns None, and indexing None with ['href'] raises the "'NoneType' object is not subscriptable" error. To read the rendered a tag itself you would have to use Selenium.
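To see the failure in isolation, here is a minimal sketch that parses only the static markup printed in the question. It assumes the snippet below is representative of what requests receives, and it uses the built-in "html.parser" instead of "lxml" purely to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Static markup copied from the output printed in the question.
STATIC_HTML = (
    '<dpp-update-inventory-link new-count="" party-id="74424458" '
    'used-count="100" zipcode="11763">\n</dpp-update-inventory-link>'
)

soup = BeautifulSoup(STATIC_HTML, "html.parser")
descript = soup.find("dpp-update-inventory-link")

# find() returns None when no matching tag exists, so check before indexing.
anchor = descript.find("a")
if anchor is None:
    print("no <a> tag in the static HTML; attributes available:", descript.attrs)
else:
    print(anchor["href"])
```

Checking for None before subscripting turns the opaque TypeError into an explicit message, and it shows that the custom tag still carries its attributes even though its children are missing.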
For the current requirement, though, you can generate the link yourself, because the URL behind that link is built from attributes of descript, such as party-id and zipcode:
def dealers_subpage(url):
    soup = request_page(url)
    descript = soup.find('dpp-update-inventory-link')
    # The link target is built from attributes of the custom tag.
    party_id = descript['party-id']
    zipcode = descript['zipcode']
    # base_url already ends with '/', so no extra slash is added here.
    url = f"{base_url}for-sale/searchresults.action/?dlId={party_id}&zc={zipcode}&searchSource=CAPTIVE_BLENDED"
    return url
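A slightly more defensive way to assemble the same URL is sketched below. The helper name build_inventory_url is hypothetical, the query parameters are the ones used in the answer above, and whether the site still accepts this URL pattern is an assumption that has not been verified here.

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "https://www.cars.com/"

def build_inventory_url(attrs, base_url=BASE_URL):
    """Return the search-results URL, or None if a required attribute is missing.

    `attrs` is any mapping of tag attributes, e.g. `descript.attrs` from BeautifulSoup.
    """
    party_id = attrs.get("party-id")
    zipcode = attrs.get("zipcode")
    if not party_id or not zipcode:
        return None
    # urlencode escapes the values and joins them with '&'.
    query = urlencode(
        {"dlId": party_id, "zc": zipcode, "searchSource": "CAPTIVE_BLENDED"}
    )
    # urljoin avoids a doubled or missing slash between base and path.
    return urljoin(base_url, "for-sale/searchresults.action/") + "?" + query

print(build_inventory_url({"party-id": "74424458", "zipcode": "11763"}))
```

Returning None when an attribute is missing lets the caller decide how to handle a dealer page without the tag data, instead of failing with a KeyError.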