
How to scrape all the prices using Selenium and Python

I'm trying to get ticket prices from Viagogo with no luck. The script seems quite simple and works with other websites, but not with Viagogo. I have no issue getting the text in the title from here.

It always returns an empty result (i.e. []). Can anyone help?

Code trials:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import pandas as pd

s = Service("[]/Downloads/chromedriver/chromedriver.exe")
driver = webdriver.Chrome(service=s)
driver.get('https://www.viagogo.com/Concert-Tickets/Country-and-Folk/Taylor-Swift-Tickets/E-151214704')
price = driver.find_elements(by=By.XPATH, value('//div[@id="clientgridtable"]/div[2]/div/div/div[3]/div/div[1]/span[@class="t-b fs16"]'))
print(price)
[]

I'm expecting the 7 prices shown on the right-hand side of the page, just above "per ticket".

There is an error in the definition of price: it should be value='...' instead of value('...'). Moreover, you should define it using an explicit wait so that the driver waits for the prices to be visible on the page.

price = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "...")))

Note that this command needs the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
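The reason the wait helps is presumably that the price grid is filled in by JavaScript after the initial page load, so a single immediate find_elements call can run before the elements exist and return []. WebDriverWait instead polls its condition (every 0.5 seconds by default) until it returns something truthy or the timeout expires. Below is a simplified, library-free sketch of that polling idea, not Selenium's actual implementation; wait_until and fake_find_prices are hypothetical names used only for illustration.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call `condition()` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll_interval)


# Hypothetical stand-in for driver.find_elements(...): the first two calls
# return [] (page still rendering), later calls return the price texts.
calls = {"count": 0}


def fake_find_prices():
    calls["count"] += 1
    return [] if calls["count"] < 3 else ["Rs.25,509", "Rs.25,873"]


print(fake_find_prices())  # a single immediate lookup: prints []
calls["count"] = 0         # reset the fake page, then poll instead
print(wait_until(fake_find_prices, timeout=5, poll_interval=0.1))
```

The single lookup comes back empty, while the polling version returns the two price strings on its third attempt. With Selenium itself, a wait that never succeeds raises TimeoutException rather than returning an empty list.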

To extract all 7 prices shown on the right-hand side of the page, you have to use WebDriverWait with visibility_of_all_elements_located(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR and text attribute:

     print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.f-list__cell-pricing-ticketstyle > div.w100 > span")))])
  • Using XPATH and text attribute:

     print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='f-list__cell-pricing-ticketstyle']/div[@class='w100 ']/span")))])
  • Console Output:

     ['Rs.25,509', 'Rs.25,873', 'Rs.27,403', 'Rs.28,788', 'Rs.72,809', 'Rs.65,593', 'Rs.29,153', 'Rs.29,153']
  • Note : You have to add the following imports:

     from selenium.webdriver.support.ui import WebDriverWait
     from selenium.webdriver.common.by import By
     from selenium.webdriver.support import expected_conditions as EC
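The values collected this way are still strings such as 'Rs.25,509'. If you want to sort or compare them, they need to be converted to numbers first. Here is a minimal sketch, assuming every price is a whole number with a currency prefix and thousands commas, as in the console output above; parse_price is a hypothetical helper, not part of Selenium.

```python
import re


def parse_price(text):
    """Turn a string like 'Rs.25,509' into the integer 25509.

    Assumes whole-number prices: every non-digit character is dropped,
    so a decimal amount such as '25,509.50' would be parsed incorrectly.
    """
    digits = re.sub(r"[^\d]", "", text)
    if not digits:
        raise ValueError(f"no digits found in {text!r}")
    return int(digits)


# Price texts copied from the console output shown above.
scraped = ['Rs.25,509', 'Rs.25,873', 'Rs.27,403', 'Rs.28,788',
           'Rs.72,809', 'Rs.65,593', 'Rs.29,153', 'Rs.29,153']

prices = [parse_price(p) for p in scraped]
print(prices)
print(min(prices), max(prices))
```

The resulting list of integers can be passed directly to pandas (for example pd.Series(prices)) if further analysis is needed.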
