I have a problem when web scraping with bs4
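I wrote code to scrape car information (title, make, model, transmission, year and price) from ebay.com.
Everything works fine, but for a few listings the 'transmission' field is replaced by an 'options' field that sits at the same location (the same class) as the transmission, and this sometimes makes the code return the wrong value.
I only want Automatic or Manual transmission. I tried a few 'if' statements to solve this, but it didn't work.
My code: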
import requests
from bs4 import BeautifulSoup
import re

url = 'https://www.ebay.com/b/Cars-Trucks/6001?_fsrp=0&_sacat=6001&LH_BIN=1&LH_ItemCondition=3000%7C1000%7C2500&rt=nc&_stpos=95125&Model%2520Year=2020%7C2019%7C2018%7C2017%7C2016%7C2015'
res = requests.get(url)
soup = BeautifulSoup(res.text, 'html.parser')
ebay_cars = soup.find_all('li', class_='s-item')

for car_info in ebay_cars:
    title_div = car_info.find('div', class_='s-item__wrapper clearfix')
    title_sub_div = title_div.find('div', class_='s-item__info clearfix')
    title_p = title_sub_div.find('span', class_='s-item__price')
    title_tag = title_sub_div.find('a', class_='s-item__link')
    title_maker = title_sub_div.find('span', class_='s-item__dynamic s-item__dynamicAttributes1')
    title_model = title_sub_div.find('span', class_='s-item__dynamic s-item__dynamicAttributes2')
    title_trans = title_sub_div.find('span', class_='s-item__dynamic s-item__dynamicAttributes3')

    name_of_car = re.sub(r'\d{4}', '', title_tag.text)
    maker_of_car = re.sub(r'Make: ', '', title_maker.text)
    model_of_car = re.sub(r'Model: ', '', title_model.text)
    try:
        trans_of_car = re.sub(r'Transmission: ', '', title_trans.text)
    except:
        trans_of_car = ''
    year_of_car = re.findall(r'\d{4}', title_tag.text)
    year_of_car = ''.join(str(x) for x in year_of_car)
    price_of_car = title_p.text
    print(trans_of_car)
Output:
Automatic
Manual
Automatic
Automatic
Automatic
Automatic
Automatic
Automatic
Automatic
Automatic
Automatic
Automatic
Automatic
Automatic
Automatic
Automatic
Options: 4-Wheel Drive
'Options: 4-Wheel Drive' is my problem.
I updated your try-except block so that the value is kept only when the text starts with 'Transmission: ':
try:
    if title_trans.text.startswith(r'Transmission: '):
        trans_of_car = re.sub(r'Transmission: ', '', title_trans.text)
    else:
        trans_of_car = ''
except AttributeError:
    trans_of_car = ''
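If you prefer to keep the loop body short, the same check could be moved into a small helper. This is only a sketch: parse_transmission is a hypothetical name (not part of bs4 or of the code above), and it assumes that 'Automatic' and 'Manual' are the only values you want to keep, as stated in the question.

```python
def parse_transmission(text):
    """Return 'Automatic' or 'Manual' for a transmission label, else ''.

    Hypothetical helper; `text` is the span text, or None when the
    span was not found on the page.
    """
    prefix = 'Transmission: '
    # Missing span, or a different attribute such as 'Options: ...'
    if not text or not text.startswith(prefix):
        return ''
    value = text[len(prefix):].strip()
    # Assumption: only these two values are wanted
    return value if value in ('Automatic', 'Manual') else ''
```

Inside the loop it could then be called as trans_of_car = parse_transmission(title_trans.text if title_trans else None), which also covers the case where find() returns None.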