
Trying to scrape Airbnb data

I'm trying to scrape some data from Airbnb (name, price, rating). I can print variables such as price, name and rating individually, but I want to put them in a dictionary. What am I missing?

import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'
}

url = 'https://www.airbnb.com/s/Tbilisi--Georgia/homes?tab_id=home_tab&refinement_paths%5B%5D=%2Fhomes&flexible_trip_dates%5B%5D=november&flexible_trip_dates%5B%5D=october&flexible_trip_lengths%5B%5D=weekend_trip&date_picker_type=calendar&query=Tbilisi%2C%20Georgia&place_id=ChIJa2JP5tcMREARo25X4u2E0GE&source=structured_search_input_header&search_type=autocomplete_click'

response = requests.get(url, headers=headers)

soup = BeautifulSoup(response.content, 'lxml')



for item in soup.find_all('div', itemprop='itemListElement'):

    try:
        price = item.find('span', class_='_krjbj').text
        rating = item.find('span', class_='_18khxk1').text
        name = item.find('meta', itemprop='name')['content']
    except Exception as e:
        house_list = {
            'price': price,
            'rating': rating,
            'name': name,
        }
        print(house_list)

The way you've written it, you'll only build and print the house_list dictionary if you run into an exception in the try block. That wouldn't work anyway: hitting an exception inside the try block means at least one of the variables you're trying to put inside house_list was never assigned in that iteration. On the first item this raises a NameError in the except block; on later items the variable silently keeps the value from a previous listing.
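A minimal sketch of that failure mode, using plain dictionaries as a stand-in for the scraped items. The names parse and broken_loop are hypothetical and only illustrate the control flow; they are not part of the original code.

```python
def parse(record):
    # Hypothetical stand-in for the item.find(...) lookups:
    # raises KeyError when a field is missing.
    return record["price"], record["name"]


def broken_loop(records):
    """Mimics the original structure: the dict is built only inside `except`."""
    results = []
    for record in records:
        try:
            price, name = parse(record)
        except Exception:
            # Reached only when parsing failed, so price/name were NOT
            # assigned in this iteration. They are either unbound or stale.
            results.append({"price": price, "name": name})
    return results


good = {"price": "$40", "name": "Loft"}
bad = {"name": "Studio"}  # no price, so parse() raises KeyError

# Every item parses cleanly: the except branch never runs, nothing is collected.
print(broken_loop([good]))

# A failure after a success: the previous listing's values are reused.
print(broken_loop([good, bad]))

# A failure on the very first item: the names were never bound.
# Inside a function this is UnboundLocalError, a subclass of NameError.
try:
    broken_loop([bad])
except NameError as exc:
    print(f"NameError: {exc}")
```

The second call is the subtle case: no error is raised, but the failed listing is recorded with the earlier listing's data.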

You probably want to do something like this instead:

# ...
    try:
        price = item.find('span', class_='_krjbj').text
        rating = item.find('span', class_='_18khxk1').text
        name = item.find('meta', itemprop='name')['content']
    except Exception as e:
        print("Ran into an Exception when trying to parse data")
        continue
    else:
        house_list = {
            'price': price,
            'rating': rating,
            'name': name,
        }
        print(house_list)
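If the goal is to keep the results rather than only print them, the same try/except/else shape can append each dictionary to a list. The sketch below is an assumption-laden illustration, not the original scraper: extract_fields and collect_houses are hypothetical names, and plain dictionaries stand in for the BeautifulSoup items so the example runs without network access. It also catches only the exceptions a failed lookup would be expected to raise (AttributeError or TypeError when find() returns None, KeyError for a missing key or attribute) instead of a bare Exception.

```python
def extract_fields(raw):
    """Hypothetical stand-in for the three item.find(...) lookups.

    Raises KeyError when a field is missing, much as the real lookups
    raise AttributeError or TypeError when find() returns None.
    """
    return {
        "price": raw["price"],
        "rating": raw["rating"],
        "name": raw["name"],
    }


def collect_houses(raw_items):
    """Return one dict per item that parsed cleanly, skipping the rest."""
    houses = []
    for raw in raw_items:
        try:
            house = extract_fields(raw)
        except (KeyError, AttributeError, TypeError) as exc:
            # Report the problem and move on to the next item.
            print(f"Skipping item, could not parse: {exc!r}")
            continue
        else:
            # Runs only when the try block finished without an exception.
            houses.append(house)
    return houses


# Made-up sample data for illustration only.
sample = [
    {"price": "$40", "rating": "4.9", "name": "Old Town loft"},
    {"price": "$55", "name": "Studio without a rating"},
]
houses = collect_houses(sample)
print(houses)
```

In the real scraper, the body of extract_fields would presumably hold the three item.find(...) calls from the snippet above, and the loop would iterate over soup.find_all(...).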
