Empty List Python Beautiful Soup

I am new to web scraping and am trying to extract information about car listings. However, when I run the code below, I only get empty lists.

import requests
from requests import get
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np

from time import sleep
from random import randint

title=[]
kilometres=[]
transmission=[]
engine=[]
price=[]
adtype=[]

url='https://www.carsales.com.au/cars/new-south-wales-state/sydney-metro-region/suv-bodystyle/?offset=0'
headers = {"Accept-Language": "en-AU, en;q=0.5"}
page=requests.get(url,headers=headers)
soup=BeautifulSoup(page.text,'html.parser')

names=soup.find_all(class_='col')
for item in names:
    title.append(item.find('a').text)

distances=soup.find_all('li',{'data-type':'Odometer'})
for item in distances:
    kilometres.append(item.text)

trans=soup.find_all('li',{'data-type':'Transmission'})
for item in trans:
    transmission.append(item.text)

engines=soup.find_all('li',{'data-type':'Engine'})
for item in engines:
    engine.append(item.text)

prices=soup.find_all(class_='price')
for item in prices:
    price.append(item.find('a').text)

adtypes=soup.find_all(class_='seller-type')
for item in adtypes:
    adtype.append(item.text)

What am I doing wrong here? I want to scrape the data from the URL into a pandas DataFrame.

To get the correct page, set a User-Agent header and set Accept-Language to "en-US,en;q=0.5":

import requests
import pandas as pd
from bs4 import BeautifulSoup

url='https://www.carsales.com.au/cars/new-south-wales-state/sydney-metro-region/suv-bodystyle/?offset=0'
headers = {"Accept-Language": "en-US,en;q=0.5", 'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}
page=requests.get(url,headers=headers)
soup=BeautifulSoup(page.text,'html.parser')

all_data = []

# Each search result is selected by its "listing-item" class.
for car in soup.select('.listing-item'):
    title = car.select_one('h3 > a').text
    price = car.select_one('.price > a').text
    type_ = car.select_one('.seller-type, .franchise-stock-type').get_text(strip=True)
    # Every <li data-type="..."> becomes a column named after its data-type value.
    details = {li['data-type']: li.text for li in car.select('li[data-type]')}
    all_data.append(dict(title=title, price=price, type=type_, **details))

df = pd.DataFrame(all_data)
print(df)

df.to_csv('data.csv')

Prints:

                                                title       price                type    Odometer Body Style Transmission                  Engine           Build Date
0   2019 Nissan Pathfinder ST-L R52 Series III Aut...   $45,878*      Dealer Used Car    1,400 km        SUV    Automatic        6cyl 3.5L Petrol                  NaN
1   2020 Land Rover Range Rover Evoque D150 S Auto...   $70,000*   Private Seller Car    3,000 km        SUV    Automatic  4cyl 2.0L Turbo Diesel                  NaN
2                 2011 SsangYong Korando S Manual 2WD    $8,750*      Dealer Used Car  164,834 km        SUV       Manual  4cyl 2.0L Turbo Diesel                  NaN
3              2016 BMW X3 xDrive20d F25 LCI Auto 4x4   $31,000*   Private Seller Car   99,654 km        SUV    Automatic  4cyl 2.0L Turbo Diesel                  NaN
4       2019 Mitsubishi Outlander ES ZL Auto 2WD MY20    $29,580          Dealer Demo        2 km        SUV    Automatic        4cyl 2.4L Petrol                  NaN
5    2012 Mazda CX-5 Grand Touring KE Series Auto AWD   $18,000*   Private Seller Car  116,590 km        SUV    Automatic  4cyl 2.2L Turbo Diesel                  NaN
6                     2020 MG HS Excite Auto FWD MY20    $32,848     New Car In Stock         NaN        SUV    Automatic  4cyl 1.5L Turbo Petrol  Build date Jan 2020
7                  2019 BMW X3 xDrive30i G01 Auto 4x4   $67,800*      Dealer Used Car   10,637 km        SUV    Automatic  4cyl 2.0L Turbo Petrol                  NaN
8              2019 BMW X1 xDrive25i F48 LCI Auto AWD    $56,990      Dealer Used Car    7,203 km        SUV    Automatic  4cyl 2.0L Turbo Petrol                  NaN
9          2019 Jeep Cherokee Trailhawk Auto 4x4 MY19    $50,890          Dealer Demo       10 km        SUV    Automatic        6cyl 3.2L Petrol                  NaN
10              2019 Audi Q2 35 TFSI design Auto MY19    $44,850          Dealer Demo    2,135 km        SUV    Automatic  4cyl 1.4L Turbo Petrol                  NaN
11  2020 Land Rover Range Rover Sport SDV8 HSE Aut...  $162,500*   Private Seller Car       48 km        SUV    Automatic  8cyl 4.4L Turbo Diesel                  NaN
12      2015 Porsche Macan S Diesel 95B Auto AWD MY15   $59,800*      Dealer Used Car   71,926 km        SUV    Automatic  6cyl 3.0L Turbo Diesel                  NaN
13   2018 Mazda CX-5 Akera KF Series Auto i-ACTIV AWD   $39,990*      Dealer Used Car   14,855 km        SUV    Automatic        4cyl 2.5L Petrol                  NaN
14  2019 Mazda CX-5 Maxx Sport KF Series Auto i-AC...   $39,950*      Dealer Used Car    9,592 km        SUV    Automatic  4cyl 2.2L Turbo Diesel                  NaN
15            2019 Mitsubishi ASX LS XD Auto 2WD MY20    $29,685          Dealer Demo      447 km        SUV    Automatic        4cyl 2.0L Petrol                  NaN
16                2012 Audi Q5 TFSI Auto quattro MY12   $22,900*   Private Seller Car   69,518 km        SUV    Automatic  4cyl 2.0L Turbo Petrol                  NaN
17              2013 Subaru XV 2.0i G4X Auto AWD MY13   $16,990*      Dealer Used Car   94,245 km        SUV    Automatic        4cyl 2.0L Petrol                  NaN
18  2019 Mitsubishi Pajero Sport Exceed QF Auto 4x...    $58,880          Dealer Demo    1,755 km        SUV    Automatic  4cyl 2.4L Turbo Diesel                  NaN

And saves data.csv (screenshot from LibreOffice):

[Screenshot: data.csv opened in LibreOffice]
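
A follow-up sketch, not part of the original answer: `select_one` returns `None` when a selector matches nothing, so calling `.text` on the result raises `AttributeError` for any listing that lacks, say, a price. The version below wraps the lookup in a small helper and runs against made-up sample markup, so it can be tried without hitting the site. The names `text_or_none`, `parse_listings` and `SAMPLE_HTML` are hypothetical, and the sample markup only mimics the selectors used above; it is not the site's real HTML.

```python
from bs4 import BeautifulSoup


def text_or_none(parent, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    node = parent.select_one(selector)
    return node.get_text(strip=True) if node else None


def parse_listings(html):
    """Build one dict per '.listing-item' element (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for car in soup.select(".listing-item"):
        row = {
            "title": text_or_none(car, "h3 > a"),
            "price": text_or_none(car, ".price > a"),
            "type": text_or_none(car, ".seller-type, .franchise-stock-type"),
        }
        # Each <li data-type="..."> becomes a key named after its data-type.
        for li in car.select("li[data-type]"):
            row[li["data-type"]] = li.get_text(strip=True)
        rows.append(row)
    return rows


# Made-up markup for illustration only; the real page structure may differ.
SAMPLE_HTML = """
<div class="listing-item">
  <h3><a href="#">2011 Example SUV Manual 2WD</a></h3>
  <div class="price"><a href="#">$8,750*</a></div>
  <div class="seller-type">Dealer Used Car</div>
  <ul>
    <li data-type="Odometer">164,834 km</li>
    <li data-type="Transmission">Manual</li>
  </ul>
</div>
<div class="listing-item">
  <h3><a href="#">2020 Example SUV Auto FWD</a></h3>
  <ul><li data-type="Transmission">Automatic</li></ul>
</div>
"""

if __name__ == "__main__":
    for row in parse_listings(SAMPLE_HTML):
        print(row)
```

If `parse_listings(page.text)` returns an empty list for the live page, the response most likely did not contain the expected markup. Printing `page.status_code` and the first few hundred characters of `page.text` is a quick way to check what the server actually sent. The list of dicts can be passed to `pd.DataFrame` as in the answer above.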


 