
Select from dropdown using Selenium and Python

With the help of Selenium and Python, I want to crawl a webpage that has a nested drop-down menu. I am posting only the nested section below:

<div class="dropDown active" data-dropdown-block="FOOTBALL_COMPSEASON" data-dropdown-default="All Seasons">
    <div class="label" id="dd-FOOTBALL_COMPSEASON">Filter by Season</div> 
    <div class="current" data-dropdown-current="FOOTBALL_COMPSEASON" role="button" tabindex="0" aria-expanded="false" aria-labelledby="dd-FOOTBALL_COMPSEASON" data-listen-keypress="true" data-listen-click="true">
        2018/19
    </div>
    <ul class="dropdownList" data-dropdown-list="FOOTBALL_COMPSEASON" role="listbox" aria-labelledby="dd-FOOTBALL_COMPSEASON" data-listen-keypress="true" data-listen-click="true">
        <li role="option" tabindex="0" data-option-name="All Seasons" data-option-id="-1" data-option-index="-1">
             All Seasons
        </li> 
        <li role="option" tabindex="0" data-option-name="2018/19" data-option-id="210" data-option-index="0">
            2018/19
         </li>
         <li role="option" tabindex="0" data-option-name="2017/18" data-option-id="79" data-option-index="1">
              2017/18
         </li>
         <li role="option" tabindex="0" data-option-name="2016/17" data-option-id="54" data-option-index="2">
             2016/17
         </li>
    </ul>
</div>

Here is the screenshot of how it looks:

So, I want to make the crawler click the drop-down and select 2017/18.

I first tried this:

driver.get(_url)
select_element = driver.find_elements_by_class_name("dropdownList")[1]

Since the class dropdownList is used multiple times in the HTML and my desired element is in the second position, i.e. <ul class="dropdownList"... is the second occurrence of that class, I used [1] to get the second element.

But then I get this error:

File "shots_2017_18.py", line 15, in shots_2017_18
    select_element = driver.find_elements_by_class_name("dropdownList")[1]
IndexError: list index out of range

What should I change so that the crawler can select the 2017/18 item from the drop-down list and crawl the page?

If you are able to click on the drop-down using Python and Selenium, then you can try this code:

UPDATE:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC 
from selenium.webdriver.common.action_chains import ActionChains
import time


driver   = webdriver.Chrome(executable_path = r'C:/Users/user***/Downloads/chromedriver_win32/chromedriver.exe')
driver.maximize_window()

wait = WebDriverWait(driver,40)

driver.get("https://www.premierleague.com/stats/top/players/goals")  

wait.until(EC.visibility_of_element_located((By.ID, 'dd-FOOTBALL_COMPSEASON')))

time.sleep(5)
drop_down_click = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.current[data-dropdown-current='FOOTBALL_COMPSEASON']")))
drop_down_click.click()

options = driver.find_elements_by_css_selector("ul[data-dropdown-list='FOOTBALL_COMPSEASON'] li")

for option in options:
    if "2017/18" in option.text.strip():
        option.click()
        break  # stop once the option is clicked; the list closes after the click

UPDATE1:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC 
from selenium.webdriver.common.action_chains import ActionChains
import time

driver   = webdriver.Chrome(executable_path = r'C:/Users/user***/Downloads/chromedriver_win32/chromedriver.exe')
driver.maximize_window()

wait = WebDriverWait(driver,40)

driver.get("https://www.premierleague.com/stats/top/players/total_scoring_att")


cookie_button = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.btn-primary.cookies-notice-accept")))
ActionChains(driver).move_to_element(cookie_button).perform()
driver.execute_script('arguments[0].click();', cookie_button)
wait.until(EC.visibility_of_element_located((By.ID, 'dd-FOOTBALL_COMPSEASON')))

time.sleep(5)
drop_down_click = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div.current[data-dropdown-current='FOOTBALL_COMPSEASON']")))
drop_down_click.click()

options = driver.find_elements_by_css_selector("ul[data-dropdown-list='FOOTBALL_COMPSEASON'] li")

for option in options:
    if "2017/18" in option.text.strip():
        option.click()
        break  # stop once the option is clicked; the list closes after the click

Explanation:

An explicit wait is code you define to wait for a certain condition to occur before proceeding further in the code. The worst case of this is time.sleep(), which sets the condition to an exact time period to wait. There are some convenience methods provided that help you write code that will wait only as long as required. WebDriverWait in combination with ExpectedCondition is one way this can be accomplished.
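The polling loop that WebDriverWait runs under the hood can be sketched in plain Python. This is a simplified illustration of the idea, not Selenium's actual implementation; `wait_until` and `flaky_condition` are hypothetical names:

```python
import time


def wait_until(condition, timeout=40, poll=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` expires.

    Sketch of what WebDriverWait.until does: call the condition repeatedly,
    sleeping between attempts, and raise once the deadline passes.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(poll)
    raise TimeoutError(f"condition not met within {timeout} seconds")


# Toy usage: the condition becomes truthy on the third call,
# so the wait returns early instead of sleeping the full timeout.
calls = {"n": 0}

def flaky_condition():
    calls["n"] += 1
    return "ready" if calls["n"] >= 3 else None

print(wait_until(flaky_condition, timeout=5, poll=0.01))  # → ready
```

This is why an explicit wait beats a fixed `time.sleep(5)`: it returns as soon as the condition holds and only fails after the full timeout.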

More about explicit waits can be found in the Selenium documentation.

When you get an IndexError like this, it means the elements found are fewer than expected: either none or just one. Since your code is otherwise correct, I think you entered the wrong URL. But if the URL is correct, you can use XPath to find the appropriate element directly. Try this code:

select_element = driver.find_element_by_xpath("//li[@data-option-name='2017/18']")
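The advantage of this XPath is that it matches on the data-option-name attribute instead of a positional index, so it cannot go out of range. To select other seasons, the selector can be built from the season string; `season_option_xpath` is a hypothetical helper, not part of Selenium:

```python
def season_option_xpath(season: str) -> str:
    """Build an XPath matching the <li> whose data-option-name
    attribute equals the given season label, e.g. '2017/18'."""
    return f"//li[@data-option-name='{season}']"


# The crawler would then locate and click the option like this
# (assumes `driver` is an open Selenium WebDriver on the stats page):
#   driver.find_element_by_xpath(season_option_xpath("2017/18")).click()
print(season_option_xpath("2017/18"))  # → //li[@data-option-name='2017/18']
```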
