Python web scrape selenium / requests

I am trying to scrape results from this page:

https://results.chronotrack.com/event/results/event/event-24381

This is what I can do manually:

1) Open the above URL in Chrome

2) Click on the Results tab

2.5) Sometimes change the race distance

3) Click Next to get to page 2

4) Open the developer tools and switch to the Network tab

5) Click Previous to get back to the first page

6) Copy the Request URL of the results-grid?callback request from the developer tools:

https://results.chronotrack.com/embed/results/results-grid?callback=results_grid17740402&sEcho=7&iColumns=11&sColumns=&iDisplayStart=0&iDisplayLength=100&mDataProp_0=0&mDataProp_1=1&mDataProp_2=2&mDataProp_3=3&mDataProp_4=4&mDataProp_5=5&mDataProp_6=6&mDataProp_7=7&mDataProp_8=8&mDataProp_9=9&mDataProp_10=10&raceID=60107&bracketID=638654&intervalID=121077&entryID=&eventID=24381&eventTag=event-24381&oemID=www.chronotrack.com&genID=17740402&x=1507682443198&_=1507682443198

Once I get this far I can manipulate the iDisplayStart parameter to page through the rest of the results.
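For illustration, paging by rewriting that parameter could look roughly like the sketch below. It uses only the standard library and does not send any requests. The helper name `page_url` is hypothetical, and the `captured` URL is a shortened copy of the one above (the real request may need every parameter, so in practice the full captured URL would go there).

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def page_url(grid_url, start, length=100):
    """Return grid_url with iDisplayStart and iDisplayLength replaced.

    Hypothetical helper: all other query parameters are kept in their
    original order, including empty ones such as entryID=.
    """
    parts = urlsplit(grid_url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    overrides = {"iDisplayStart": str(start), "iDisplayLength": str(length)}
    params = [(key, overrides.get(key, value)) for key, value in params]
    return urlunsplit(parts._replace(query=urlencode(params)))


# Shortened version of the captured URL, for illustration only.
captured = (
    "https://results.chronotrack.com/embed/results/results-grid"
    "?callback=results_grid17740402&sEcho=7&iDisplayStart=0&iDisplayLength=100"
    "&raceID=60107&bracketID=638654&intervalID=121077&entryID=&eventID=24381"
)

# URLs for the first three pages of 100 rows each.
urls = [page_url(captured, start) for start in range(0, 300, 100)]
```

Each entry in `urls` could then be fetched with requests. The `callback` parameter suggests the response is JSONP rather than plain JSON, so the callback wrapper would probably need stripping before parsing; that is an assumption, not something verified here.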

Is there a way for me to find that URL using requests and/or selenium? With Selenium I tried opening the first page then clicking over to results with the following:

driver.find_element_by_id('resultsResultsTab').click()

But I get the following error:

Element is not currently visible and may not be manipulated

Can anyone get me pointed in the right direction?

Try waiting until the required element becomes visible:

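from selenium.webdriver.support.ui import WebDriverWait as wait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

# Wait up to 10 seconds for the Results tab to become visible, then click it.
# `driver` is the WebDriver instance from the question.
wait(driver, 10).until(EC.visibility_of_element_located((By.ID, 'resultsResultsTab'))).click()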
