Selenium - How to click all the More buttons of each individual item to scrape the data from the dropdown

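I am trying to scrape information from a page, but when I open the exported CSV file it is blank apart from the headers.

I am trying to scrape the 10 results on this page: https://www.narpm.org/find/property-managers/?submitted=true&toresults=1&resultsperpage=10&a=managers&orderby=&fname=&lname=&company=&chapter=S005&city=&state=&xRadius=

I can scrape the name, company, city and state, but the scrape stops working once I click the "More" dropdown. (No error is raised; the CSV is simply blank.)

I suspect the problem is in this line of code: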

driver.find_element_by_xpath('//div[@class="col-md-4 col-lg-1 arrow"]').click()

Here is all of my code:

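import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()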
options.headless = True
driver = webdriver.Chrome(executable_path='/Users/vilje/anaconda3/envs/webscrape/chromedriver', options=options)
driver.set_window_size(1440, 900)

# Creates master dataframe
df = pd.DataFrame(columns=['Name','Company', 'City', 'State', 'Phone', 'About']) 

# URL
driver.get('https://www.narpm.org/find/property-managers/?submitted=true&toresults=1&resultsperpage=10&a=managers&orderby=&fname=&lname=&company=&chapter=S005&city=&state=&xRadius=')


name = driver.find_elements_by_xpath('//span[@class="name"]')
company = driver.find_elements_by_xpath('//div[@class="col-md-6 col-lg-4"]')
city = driver.find_elements_by_xpath('//div[@class="col-md-4 col-lg-2"]')
state = driver.find_elements_by_xpath('//div[@class="col-md-4 col-lg-2"]')

# Expand the 'More' button
driver.find_element_by_xpath('//div[@class="col-md-4 col-lg-1 arrow"]').click()
phone = driver.find_elements_by_xpath('//div[@class="col-sm-6 col-lg-3 with-icon lighter-text"]')
about = driver.find_elements_by_xpath('//div[@class="col-sm-12"]')

name_list = []
for n in range(len(name)):
    name_list.append(name[n].text)

company_list = []
for c in range(len(company)):
    company_list.append(company[c].text)

city_list = []
for c in range(len(city)):
    city_list.append(city[c].text)

state_list = []
for s in range(len(state)):
    state_list.append(state[s].text)

phone_list = []
for p in range(len(phone)):
    phone_list.append(phone[p].text)

about_list = []
for a in range(len(about)):
    about_list.append(about[a].text)

# List of each property managers name, company, city, state, phone and about section paired together
data_tuples = list(zip(name_list[0:], company_list[0:], city_list[0:], state_list[0:], phone_list[0:], about_list[0:]))

# Creates dataframe of each tuple in list
temp_df = pd.DataFrame(data_tuples, columns=['Name','Company', 'City', 'State', 'Phone', 'About']) 

# Appends to master dataframe
df = df.append(temp_df)

driver.close()

[Image: the More buttons]

Can anyone help me click the More button for every person so that I can scrape the data from the dropdown?
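
One plausible explanation for the blank CSV, offered as a guess rather than something the post confirms: `zip()` stops at its shortest input, so if the phone or about locator matches nothing, `data_tuples` ends up empty and only the headers survive. The sketch below reproduces that with made-up values (the names in the lists are hypothetical stand-ins for scraped text) and shows `itertools.zip_longest` as one way to keep the rows.

```python
from itertools import zip_longest

# Hypothetical stand-ins for the scraped lists: the names were found,
# but the phone locator matched nothing.
name_list = ["Alice Example", "Bob Example"]
phone_list = []

# zip() stops at the shortest input, so one empty list empties the result.
truncated = list(zip(name_list, phone_list))

# zip_longest() pads the missing values instead, which keeps the rows
# and makes the gap visible in the output.
padded = list(zip_longest(name_list, phone_list, fillvalue=""))

print(truncated)
print(padded)
```

Printing `len()` of each list before zipping is a quick way to see which locator came back empty.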

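To click all the elements with the text More, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies: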

  • Using CSS_SELECTOR:

     driver.get("https://www.narpm.org/find/property-managers/?submitted=true&toresults=1&resultsperpage=10&a=managers&orderby=&fname=&lname=&company=&chapter=S005&city=&state=&xRadius=")
     for more in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.row div.arrow"))):
         more.click()
  • Using XPATH:

     driver.get("https://www.narpm.org/find/property-managers/?submitted=true&toresults=1&resultsperpage=10&a=managers&orderby=&fname=&lname=&company=&chapter=S005&city=&state=&xRadius=")
     for more in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='row']//div[contains(@class, 'arrow') and contains(., 'More')]"))):
         more.click()
  • Note: You have to add the following imports:

     from selenium.webdriver.support.ui import WebDriverWait
     from selenium.webdriver.common.by import By
     from selenium.webdriver.support import expected_conditions as EC
  • Browser snapshot:

     [Image: MoreButtonsClicked]
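
Putting the answer together with the question's goal, a minimal sketch might look like the following. It assumes Selenium 4, where the `find_elements_by_xpath` helpers used in the question were removed (in Selenium 4.3) in favour of `find_elements(By.XPATH, ...)`, and it assumes a chromedriver that Selenium can locate on its own. The function names `element_texts`, `click_all_more`, `scrape_phones` and `main` are hypothetical. The selectors are the ones quoted in the question and the answer, and they have not been re-checked against the live page.

```python
URL = (
    "https://www.narpm.org/find/property-managers/?submitted=true&toresults=1"
    "&resultsperpage=10&a=managers&orderby=&fname=&lname=&company="
    "&chapter=S005&city=&state=&xRadius="
)

# XPath taken from the question; assumed to match the expanded phone field.
PHONE_XPATH = '//div[@class="col-sm-6 col-lg-3 with-icon lighter-text"]'


def element_texts(elements):
    """Return the stripped .text of each element-like object."""
    return [el.text.strip() for el in elements]


def click_all_more(driver, timeout=20):
    """Wait until the More toggles are visible, click each, return the count."""
    # Imported lazily so element_texts() can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    toggles = WebDriverWait(driver, timeout).until(
        EC.visibility_of_all_elements_located(
            (By.CSS_SELECTOR, "div.row div.arrow")
        )
    )
    for toggle in toggles:
        toggle.click()
    return len(toggles)


def scrape_phones(driver):
    """Collect the phone text that becomes available after expanding."""
    from selenium.webdriver.common.by import By

    return element_texts(driver.find_elements(By.XPATH, PHONE_XPATH))


def main():
    """Open the page, expand every entry, and print the phone numbers."""
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(URL)
        clicked = click_all_more(driver)
        phones = scrape_phones(driver)
        print(f"clicked {clicked} toggles, found {len(phones)} phone entries")
        print(phones)
    finally:
        # quit() ends the whole browser session, not just the current window.
        driver.quit()
```

Calling `main()` runs the whole flow. Comparing the two counts it prints is a cheap check that every expanded entry actually exposed a phone field before the lists are paired into rows.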
