
Getting NoneType error with BeautifulSoup even though the element exists

I'm trying to scrape the webpage https://www.cars.com/dealers/5374692/carvana-touchless-delivery-to-your-home/

On this page there is a "See All Vehicles" button, and I'm trying to get the href of that tag.

So far I've made this work using Selenium, but opening a webdriver every time takes too much time, so I'd rather not use Selenium.

With BeautifulSoup, however, I get a NoneType error. My code is:

import requests
from bs4 import BeautifulSoup
import re

base_url = 'https://www.cars.com/'

def request_page(url):
    session = requests.Session()
    my_headers = {"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:88.0) Gecko/20100101 Firefox/88.0","Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"}
    response = session.get(url, headers=my_headers)
    soup = BeautifulSoup(re.sub("<!---->","", response.text), "lxml")
    return soup

def dealers_subpage(url):
    try:
        soup = request_page(url)
        descript = soup.find('dpp-update-inventory-link')
        print(descript.prettify())
        link = descript.find('a')['href']
        return base_url+str(link)
    except Exception as e:
        print(e,url)


dealers_subpage('https://www.cars.com/dealers/5374692/carvana-touchless-delivery-to-your-home/')

For this code I'm getting this output:

<dpp-update-inventory-link new-count="" party-id="74424458" used-count="100" zipcode="11763">
</dpp-update-inventory-link>

'NoneType' object is not subscriptable https://www.cars.com/dealers/5374692/carvana-touchless-delivery-to-your-home/
    

My question is: why is it not reading the a tag that is present there?

Note: use incognito/private mode to visit the webpage, since in a normal window it redirects to another page.

The page is loaded dynamically, so you cannot get the a tag inside dpp-update-inventory-link from the HTML that requests returns. Even when you print descript.prettify(), no a tag is present, which means it is rendered later by JavaScript. To read the rendered tag directly you have to use Selenium.
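
To make the failure explicit instead of hitting the 'NoneType' error, the lookup can be guarded. This is only a minimal sketch: it runs on the static markup copied from the output in the question rather than on a live request, and the helper name find_inventory_href is hypothetical. It uses the built-in "html.parser" so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Static HTML as printed in the question: the custom element has no children.
STATIC_HTML = """
<dpp-update-inventory-link new-count="" party-id="74424458" used-count="100" zipcode="11763">
</dpp-update-inventory-link>
"""


def find_inventory_href(html):
    """Return the href of the <a> inside the custom element, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.find("dpp-update-inventory-link")
    if element is None:
        return None
    # find() returns None when the <a> is only injected later by JavaScript.
    anchor = element.find("a")
    if anchor is None:
        return None
    return anchor.get("href")


print(find_inventory_href(STATIC_HTML))
```

Because the static markup contains no a tag, the function returns None here rather than raising an exception.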

For the current requirement, though, you can generate the link yourself, because the link's href is built from attributes of descript such as party-id and zipcode:

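def dealers_subpage(url):
    soup = request_page(url)
    # The custom element is in the static HTML; only the <a> inside it is rendered later.
    descript = soup.find('dpp-update-inventory-link')
    # Read the attributes the page uses to build the link.
    party_id = descript['party-id']
    zipcode = descript['zipcode']
    # base_url already ends with '/', so do not add another one.
    return f"{base_url}for-sale/searchresults.action/?dlId={party_id}&zc={zipcode}&searchSource=CAPTIVE_BLENDED"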
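
The URL assembly can also be sketched on its own with the standard library, which takes care of query-string escaping and of the slash between base URL and path. The path and the parameter names (dlId, zc, searchSource) are taken from the answer above and have not been verified against the site; build_inventory_url is a hypothetical helper name.

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "https://www.cars.com/"


def build_inventory_url(party_id, zipcode):
    """Build the inventory search URL from the element's attributes."""
    # Parameter names follow the answer above (assumed, not verified).
    query = urlencode({
        "dlId": party_id,
        "zc": zipcode,
        "searchSource": "CAPTIVE_BLENDED",
    })
    # urljoin avoids a doubled slash between the base URL and the path.
    return urljoin(BASE_URL, "for-sale/searchresults.action/") + "?" + query


# Attribute values copied from the output shown in the question.
print(build_inventory_url("74424458", "11763"))
```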
