
Parse Airdna map hover-over text using Selenium & BeautifulSoup

I'm trying to scrape data from the window that appears when hovering over a marker in the map view, specifically the "Days Available" value shown in that window.

image of text which I am trying to scrape

I'm struggling to hover over all the purple markers one by one in the map view using Python, Selenium WebDriver, and BeautifulSoup. I managed to write the code below, but the mapMarkers variable is always empty.

from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://www.airdna.co/vacation-rental-data/app/us/california/santa-monica/overview")

input("Press Enter to continue...")  # wait until page loads and tutorial is closed

mapMarkers = driver.find_elements_by_class_name("Page__RightColumn-sc-291lxm-3")  # get a list of marker element

target_list = []

for i in range(len(mapMarkers)):
    mapMarkers[i].click() # click to appear hover over window
    html = driver.page_source
    soup = BeautifulSoup(html, "lxml")

    days = soup.find_all("p", {"class": ['info-window__statistics-value']})
    link   = soup.find_all("a", {"class": ['info-window__property-link']})
    target_list.append( { 
        days[0].text.replace('\n', '').replace(' ', ''), 
        link[0].attrs['href'] 
    } )


driver.quit()

This is the link to the website.

Some sites fetch their data from a private API, and this site is one of them. To get at the API data, you need to inspect the page's network activity.

Right-click on the page and click Inspect to open DevTools. Go to the Network tab, search for the API request, then click Preview to see its content.


Right-click the request, choose Copy as cURL, and then translate the command into Python using this site.
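
To make the translation step more concrete, here is a minimal sketch of how the query string of a copied request URL maps onto an endpoint plus a set of parameters. It uses only the standard library. The URL is assembled from the parameters used in this answer, with the access_token left out on purpose, so treat it as an illustration rather than the exact string DevTools will give you.

```python
from urllib.parse import urlsplit, parse_qsl

# Illustrative URL, as it might appear inside a copied cURL command.
# The access_token parameter is deliberately omitted here.
copied_url = (
    "https://api.airdna.co/v1/market/property_list"
    "?city_id=59053&start_month=10&start_year=2018"
    "&number_of_months=36&currency=native&show_regions=true"
)

parts = urlsplit(copied_url)

# Everything before the "?" is the endpoint passed to requests.get().
endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"

# Everything after the "?" becomes the params argument.
params = dict(parse_qsl(parts.query))

print(endpoint)
print(params)
```

Keeping the parameters separate from the endpoint makes it easy to change values such as city_id or number_of_months without editing the URL by hand.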


Your code will look like the following:

import requests

headers = {
    'authority': 'api.airdna.co',
    'sec-ch-ua': '"Chromium";v="94", "Google Chrome";v="94", ";Not A Brand";v="99"',
    'sec-ch-ua-mobile': '?0',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36',
    'sec-ch-ua-platform': '"Windows"',
    'accept': '*/*',
    'origin': 'https://www.airdna.co',
    'sec-fetch-site': 'same-site',
    'sec-fetch-mode': 'cors',
    'sec-fetch-dest': 'empty',
    'referer': 'https://www.airdna.co/',
    'accept-language': 'en-US,en;q=0.9',
}

params = (
    ('access_token', 'MjkxMTI|8b0178bf0e564cbf96fc75b8518a5375'),
    ('city_id', '59053'),
    ('start_month', '10'),
    ('start_year', '2018'),
    ('number_of_months', '36'),
    ('currency', 'native'),
    ('show_regions', 'true'),
)

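# timeout added so the request cannot hang indefinitely
response = requests.get('https://api.airdna.co/v1/market/property_list', headers=headers, params=params, timeout=30)
response.raise_for_status()  # fail early on an HTTP error instead of a confusing KeyError later

results = response.json()["properties"]

# print the title and "Days Available" value of the first 20 properties
for result in results[:20]:
    title = result["title"]
    days_available = result["days_available"]
    print(f"{title} : {days_available}")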

Result:

Panoramic Ocean View Studio Loft : 274
Private 1906 Bungalow : 364
Serene Garden Room by the Beach!!! : 188
Bright New Beachside Master Suite : 171
Bright New Beachside Bedroom : 164
Pvt bedroom-pvt bath & entryway. Ocean front Views : 155
Elegant Design Apartment with Courtyard Garden Dining Space : 224
Liz''s Beachy Retreat in Santa Monica! : 55
Santa Monica One BedRoom Apt.(Ocean Breeze B) : 26
ROOM & BATH. 4 BLOCKS TO OCEAN. N OF WILSHIRE. : 114
Comfy Room - Amazing Location! : 178
Steps to Beach in Gorgeous Suite! : 84
Stunning Three Bedroom Santa Monica Beach Home : 224
Santa Monica with parking/Montana close to beach : 156
PRIVATE ROOM W/BR IN SANTA MONICA : 293
Private Room with Bathroom at Beach :)Just Perfect : 334
Newly Furnished! 1 Bed Beach Condo : 45
Santa Monica Beach House!Prime area : 264
Santa Monica Canyon Pied-a-Terre : 355
Santa Monica Beach Suite 5 : 276
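
If you want to keep the values rather than just print them, a small helper along these lines may be useful. It is only a sketch: the function name days_available_by_title is made up for illustration, the sample payload merely mimics the "properties" / "title" / "days_available" shape used above, and skipping entries with a missing field is an assumption about how you would want incomplete records handled.

```python
def days_available_by_title(payload):
    """Return (title, days_available) pairs, highest number of days first.

    Hypothetical helper: entries missing either field are skipped.
    """
    rows = []
    for prop in payload.get("properties", []):
        title = prop.get("title")
        days = prop.get("days_available")
        if title is None or days is None:
            continue  # assumed policy: ignore incomplete records
        rows.append((title, days))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Sample data shaped like the API response, not a real response.
sample_payload = {
    "properties": [
        {"title": "Panoramic Ocean View Studio Loft", "days_available": 274},
        {"title": "Private 1906 Bungalow", "days_available": 364},
        {"title": "Listing without a value"},
    ]
}

for title, days in days_available_by_title(sample_payload):
    print(f"{title} : {days}")
```

With a live response you would pass response.json() instead of sample_payload.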
