How can I scrape the exact value/text of a <td> by its XPath after selecting a drop-down option with Selenium?

Question

I am trying to extract a specific value from a table for each of the five days shown in the drop-down menu.

I need to be able to get each day's settle value on a recurring basis (each week the user will scrape the five new prices). Currently my script only retrieves today's posted value from the table.

I had similar issues when using lxml to extract the XPath, which is why I thought it must be a JavaScript thing, so I am trying out Selenium now. Any help or direction is appreciated.

from selenium import webdriver

path_to_chromedriver = '/Users/Daniel/Desktop/chromedriver' 

driver = webdriver.Chrome(executable_path='C:\Users\Daniel\Desktop\chromedriver\chromedriver.exe')

url = 'http://www.cmegroup.com/trading/energy/crude-oil/light-sweet-crude_quotes_settlements_futures.html'
driver.get(url)

driver.find_element_by_xpath('//*[@id="cmeTradeDate"]/option[2]').click()

driver.implicitly_wait(10)

settle_price = driver.find_element_by_xpath('//*[@id="settlementsFuturesProductTable"]/tbody/tr[1]/td[6]').text

print settle_price

http://www.cmegroup.com/trading/energy/crude-oil/light-sweet-crude_quotes_settlements_futures.html

(Crude Oil Prices; the drop-down contains five days.) The value needed from the page is the settle price for May 16 (XPath):

 //*[@id="settlementsFuturesProductTable"]/tbody/tr[1]/td[6]

Why is the text output the value for today's page and not the value for the drop-down option that the browser switches to?

Answer 1

There are a couple of things to introduce to solve it:

Use the Select class and its .options property to get and iterate over the drop-down options.
Once an option value is selected, the table gets updated. To catch the moment it has finished updating, explicitly wait for the "Processing" spinner/loader element to become invisible (there are different strategies; this is just one of them).
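
To make the idea of an explicit wait concrete, here is a minimal sketch in plain Python of the polling loop it boils down to. `wait_until` and `loader_is_gone` are hypothetical names used only for illustration; they are not part of Selenium. With Selenium you would normally rely on `WebDriverWait(...).until(...)`, which polls a condition in a similar way and raises its own timeout exception.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises RuntimeError if the condition is still falsy after `timeout` seconds.
    """
    deadline = time.time() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.time() >= deadline:
            raise RuntimeError("condition not met within %.1f seconds" % timeout)
        time.sleep(poll_interval)


# Stand-in for a page whose "Processing" loader disappears on the third check.
checks = {"count": 0}


def loader_is_gone():
    checks["count"] += 1
    return checks["count"] >= 3


wait_until(loader_is_gone, timeout=5.0, poll_interval=0.01)
```

The point is that reading the table cell only happens after the condition holds, instead of immediately after the click.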

Implementation:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.select import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)

url = 'http://www.cmegroup.com/trading/energy/crude-oil/light-sweet-crude_quotes_settlements_futures.html'
driver.get(url)

select = Select(driver.find_element_by_id("cmeTradeDate"))
for option in select.options:
    # selecting a value in the dropdown
    select.select_by_value(option.get_attribute("value"))

    # wait for the table to load
    wait.until(EC.invisibility_of_element_located((By.CSS_SELECTOR, ".cmeProgressPanel")))

    # get the desired price
    selected_price = driver.find_element_by_xpath('//*[@id="settlementsFuturesProductTable"]/tbody/tr[1]/td[6]')
    print(option.text, selected_price.text)

Prints:

(u'Friday, 15 Apr 2016 (Final)', u'40.36')
(u'Thursday, 14 Apr 2016 (Final)', u'41.50')
(u'Wednesday, 13 Apr 2016 (Final)', u'41.76')
(u'Tuesday, 12 Apr 2016 (Final)', u'42.17')
(u'Monday, 11 Apr 2016 (Final)', u'40.36')
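
If the weekly prices need to be stored or compared rather than just printed, the two strings can be converted into a date and a number. This is only a sketch: `parse_settlement` is a hypothetical helper, and it assumes the option text keeps the format shown above (English day and month names, optionally followed by a marker such as "(Final)").

```python
from datetime import datetime


def parse_settlement(option_text, price_text):
    """Turn ('Friday, 15 Apr 2016 (Final)', '40.36') into (date, float)."""
    # Drop a trailing status marker such as " (Final)" if present.
    date_part = option_text.split(" (")[0]
    trade_date = datetime.strptime(date_part, "%A, %d %b %Y").date()
    return trade_date, float(price_text)


# Two of the rows printed above, used as sample input.
rows = [
    (u"Friday, 15 Apr 2016 (Final)", u"40.36"),
    (u"Thursday, 14 Apr 2016 (Final)", u"41.50"),
]
parsed = [parse_settlement(text, price) for text, price in rows]
```

Inside the loop you could call `parse_settlement(option.text, selected_price.text)` and append the result to a list instead of printing it.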

Answer 1 (accepted), 2016-04-16 01:13:38
