
How to scrape data from a dynamic website with Selenium

I am new to Selenium and want to scrape the price and the offer end time from a Udemy course page. How can I do this?

The price and the offer end time are loaded dynamically. I know how to extract static content from a page, but not dynamic content.

I have tried the Parsel library together with Selenium, but it returns an empty string. When I view the page source on my mobile, the price is not in the source. When I use the Inspect Element option in Chrome or Firefox, however, the price appears inside a span tag. This means the price is loaded dynamically when the page is rendered in the browser. How can I get it with Selenium?

Here is an example Udemy Course link:

https://www.udemy.com/course/data-science-deep-learning-in-python/

With selenium, beautifulsoup4, and webdriver-manager installed in your environment, this code should work:

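    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from bs4 import BeautifulSoup
    from webdriver_manager.chrome import ChromeDriverManager

    # Selenium 4 takes the driver path through a Service object.
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    driver.get("https://www.udemy.com/course/appium-selenium-for-mobile-automation-testing/")

    # Read the page as loaded by the browser, then close the browser.
    content = driver.page_source
    driver.quit()

    soup = BeautifulSoup(content, 'html.parser')

    # These class names are generated by the site and may change over time.
    price = soup.find('div', {'class': 'price-text--price-part--Tu6MH udlite-clp-discount-price udlite-heading-xl'})
    if price is not None:
        # Keep the stripped text and drop the 'Current price' label.
        price_text = price.text.strip().replace('Current price', '')
        print('Price: ' + price_text)

        offer_end_time = soup.find('span', {'data-purpose': 'safely-set-inner-html:discount-expiration:expiration-text'})
        if offer_end_time is not None:
            print('Offer end time: ' + offer_end_time.text.strip())
    else:
        print('This is a free course')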
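
If the price has not been rendered yet when `page_source` is read, the lookup can still come back empty. One possible way around that is an explicit wait, sketched below with Selenium's `WebDriverWait`. The function names `clean_price` and `fetch_price_and_offer`, the constants `PRICE_CSS` and `OFFER_CSS`, and the 15-second timeout are illustrative choices, not part of the original answer. The selectors are derived from the class and `data-purpose` values used above and are assumed to still match the live page, which may no longer be the case.

```python
# Assumed selectors, derived from the attributes used in the answer's code.
PRICE_CSS = "div.udlite-clp-discount-price"
OFFER_CSS = 'span[data-purpose="safely-set-inner-html:discount-expiration:expiration-text"]'


def clean_price(raw: str) -> str:
    """Remove the 'Current price' label and surrounding whitespace."""
    return raw.replace("Current price", "").strip()


def fetch_price_and_offer(url: str, timeout: float = 15.0):
    """Return (price, offer_end_time); either is None if it never appears."""
    # Imported here so clean_price() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait
    from webdriver_manager.chrome import ChromeDriverManager

    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    try:
        driver.get(url)
        wait = WebDriverWait(driver, timeout)

        def text_of(css):
            # Poll until the element is visible, or give up after `timeout` seconds.
            try:
                locator = (By.CSS_SELECTOR, css)
                return wait.until(EC.visibility_of_element_located(locator)).text
            except TimeoutException:
                return None

        price = text_of(PRICE_CSS)
        offer = text_of(OFFER_CSS)
        return (
            clean_price(price) if price is not None else None,
            offer.strip() if offer is not None else None,
        )
    finally:
        # Always close the browser, even if a lookup raised an error.
        driver.quit()
```

Calling `fetch_price_and_offer("https://www.udemy.com/course/data-science-deep-learning-in-python/")` would return a tuple of two strings, with `None` in place of any value that did not show up within the timeout. For a free course both waits run to the full timeout, so a shorter value may be preferable there.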


 