
Loop through url with Selenium Webdriver

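The request below finds the IDs of today's contests. I am trying to pass that str into the driver.get url so that the driver goes to each contest url and downloads each contest's CSV. I imagine you have to write a loop, but I'm not sure what that would look like with a webdriver.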

import time
from selenium import webdriver
import requests
import datetime

req = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA') 
data = req.json()

for ids in data:
    contest = ids['id']

driver = webdriver.Chrome()  # Optional argument, if not specified will search path.
driver.get('https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby');
time.sleep(2) # Let DK Load!

search_box = driver.find_element_by_name('username')
search_box.send_keys('username')
search_box2 = driver.find_element_by_name('password')
search_box2.send_keys('password')
submit_button = driver.find_element_by_xpath('//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
submit_button.click()
time.sleep(2) # Let Page Load, If not it will go to Account!


driver.get('https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest) + '') 

Try it in this order, with the loop placed after the login so that every contest URL is visited:

import time
from selenium import webdriver
import requests
import datetime

req = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA')
data = req.json()



driver = webdriver.Chrome()  # Optional argument, if not specified will search path.
driver.get('https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby')
time.sleep(2) # Let DK Load!

search_box = driver.find_element_by_name('username')
search_box.send_keys('username')
search_box2 = driver.find_element_by_name('password')
search_box2.send_keys('password')
submit_button = driver.find_element_by_xpath('//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
submit_button.click()
time.sleep(2) # Let Page Load, If not it will go to Account!

for ids in data:
    contest = ids['id']
    driver.get('https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest) + '')

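You don't need to load Selenium x times to download x files. Requests and Selenium can share cookies. This means you can log in to the site with Selenium, retrieve the login details, and share them with Requests or any other application. Take a moment to check out httpie, https://httpie.org/doc#sessions; it appears to let you control sessions manually, as Requests does.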

For Requests, see: http://docs.python-requests.org/en/master/user/advanced/?highlight=sessions
For Selenium, see: http://selenium-python.readthedocs.io/navigating.html#cookies
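As a minimal sketch of the cookie sharing described above: the helper below copies the cookie dicts that Selenium's driver.get_cookies() returns into a requests.Session. The function name session_from_selenium_cookies is hypothetical, and the sketch assumes each cookie dict carries at least 'name' and 'value' keys.

```python
import requests


def session_from_selenium_cookies(selenium_cookies):
    """Build a requests.Session that carries the browser's cookies.

    selenium_cookies is the list of dicts returned by driver.get_cookies().
    """
    session = requests.Session()
    for cookie in selenium_cookies:
        session.cookies.set(
            cookie['name'],
            cookie['value'],
            domain=cookie.get('domain', ''),  # empty string means "no domain restriction"
            path=cookie.get('path', '/'),
        )
    return session
```

After logging in with the driver, something like session = session_from_selenium_cookies(driver.get_cookies()) should give a session that can be reused for every download, rather than building a new one per file.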

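Look at the Webdriver block: you can add a proxy and run the browser either headless or live. Just comment out the headless line and the browser should load live, which makes debugging easier and makes it easier to follow movements and changes in the site's API / HTML.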

import shutil
import sys
import time

import requests
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By

LOGIN = 'https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby'
BASE_URL = 'https://www.draftkings.com/contest/exportfullstandingscsv/'
USER = ''
PASS = ''

try:
    data = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA').json()
except (requests.RequestException, ValueError) as e:
    # Network failure or a response that is not valid JSON.
    print(e)
    sys.exit(1)

ids = [str(item['id']) for item in data]

# Webdriver block
options = webdriver.ChromeOptions()
options.add_argument('--headless')  # comment out this line to watch the browser live
options.add_argument('--window-size=800,600')
# options.add_argument('--proxy-server=IP:PORT')
# options.add_argument('--user-agent=' + USER_AGENT)
driver = webdriver.Chrome(options=options)

try:
    driver.implicitly_wait(2)  # wait up to 2 seconds when locating elements
    driver.get(LOGIN)
except WebDriverException:
    driver.quit()
    sys.exit(1)


def login(user, password):
    '''
    Login to draftkings.
    Retrieve authentication/authorization.

    http://selenium-python.readthedocs.io/waits.html#implicit-waits
    http://selenium-python.readthedocs.io/api.html#module-selenium.common.exceptions

    '''

    search_box = driver.find_element(By.NAME, 'username')
    search_box.send_keys(user)

    search_box2 = driver.find_element(By.NAME, 'password')
    search_box2.send_keys(password)

    submit_button = driver.find_element(By.XPATH, '//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
    submit_button.click()

    time.sleep(2)  # let the login finish before reading the cookies

    return driver.get_cookies()


site_cookies = login(USER, PASS)


def get_csv_files(contest_id):
    '''
    get each id and download the file.
    '''

    session = requests.Session()

    # Copy the cookies from the logged-in browser into the requests session.
    for cookie in site_cookies:
        session.cookies.set(cookie['name'], cookie['value'])

    try:
        response = session.get(BASE_URL + contest_id, stream=True)
        response.raise_for_status()
        response.raw.decode_content = True  # undo gzip/deflate before writing to disk
        with open(contest_id + '.csv', 'wb') as f:
            shutil.copyfileobj(response.raw, f)
    except (requests.RequestException, OSError) as e:
        print(e)


# map() is lazy in Python 3, so use an explicit loop to run the downloads.
for contest_id in ids:
    get_csv_files(contest_id)

driver.quit()

Would this help?

for ids in data:
    contest = ids['id']
    driver.get('https://www.draftkings.com/contest/exportfullstandingscsv/' + str(contest) + '') 

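It may be time to break this down a bit.
Create some isolated functions that:
0. (Optional) Provide authorization for the target URL.
1. Collect all the needed ids (the first part of your code).
2. Export the CSV for a specific id (the second part of your code).
3. Loop through the list of ids and call func #2 for each id.

Share the chromedriver as an input argument to each of them, to preserve the driver state and the auth cookie.
It works fine and makes the code clear and readable.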
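The steps above could be sketched roughly as follows. The function names (collect_ids, export_csv, export_all) are hypothetical, the export URL is taken from the question, and the driver is assumed to be an already logged-in Selenium webdriver passed in as an argument (step 0 is left out here).

```python
EXPORT_URL = 'https://www.draftkings.com/contest/exportfullstandingscsv/'


def collect_ids(data):
    """Step 1: pull the contest ids out of the decoded JSON list."""
    return [str(item['id']) for item in data]


def export_csv(driver, contest_id):
    """Step 2: point the shared driver at one contest's CSV export URL."""
    driver.get(EXPORT_URL + contest_id)


def export_all(driver, contest_ids):
    """Step 3: call step 2 once per id, reusing the same driver and its cookies."""
    for contest_id in contest_ids:
        export_csv(driver, contest_id)
```

With the question's variables, usage would look something like export_all(driver, collect_ids(data)) after the login has completed. Because the driver is passed in rather than created inside each function, the login state carries over from one call to the next.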

I think you can set the URL of a contest on an a element in the landing page and then click it. Then repeat that step for the other IDs.

See the code below.

req = requests.get('https://www.draftkings.com/lobby/getlivecontests?sport=NBA') 
data = req.json()
contests = []

for ids in data:
    contests.append(ids['id'])

driver = webdriver.Chrome()  # Optional argument, if not specified will search path.
driver.get('https://www.draftkings.com/account/sitelogin/false?returnurl=%2Flobby');
time.sleep(2) # Let DK Load!

search_box = driver.find_element_by_name('username')
search_box.send_keys('username')
search_box2 = driver.find_element_by_name('password')
search_box2.send_keys('password')
submit_button = driver.find_element_by_xpath('//*[@id="react-mobile-home"]/section/section[2]/div[3]/button/span')
submit_button.click()
time.sleep(2) # Let Page Load, If not it will go to Account!

for id in contests:
    element = driver.find_element_by_css_selector('a')
    script1 = "arguments[0].setAttribute('download',arguments[1]);"
    driver.execute_script(script1, element, str(id) + '.csv')
    script2 = "arguments[0].setAttribute('href',arguments[1]);"
    driver.execute_script(script2, element, 'https://www.draftkings.com/contest/exportfullstandingscsv/' + str(id))
    time.sleep(1)
    element.click()
    time.sleep(3)


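Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you republish this page, please cite this site's URL or the address of the original post. For any questions, contact: yoyou2525@163.com.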
