How can I scrape the value/text of an exact <td> using its XPath, after selecting a drop-down option using Selenium?
I am trying to extract a specific value from a table for each of the five days shown in the drop-down menu.
I need to be able to get each day's settle value on a recurring basis (each week the user will scrape the five new prices). Currently my script only retrieves today's posted value from the table.
I had similar issues when using lxml to extract the value by XPath, which is why I thought it must be a JavaScript thing, so I am trying out Selenium now. Any help or direction is appreciated.
from selenium import webdriver
path_to_chromedriver = '/Users/Daniel/Desktop/chromedriver'
driver = webdriver.Chrome(executable_path='C:\Users\Daniel\Desktop\chromedriver\chromedriver.exe')
url = 'http://www.cmegroup.com/trading/energy/crude-oil/light-sweet-crude_quotes_settlements_futures.html'
driver.get(url)
driver.find_element_by_xpath('//*[@id="cmeTradeDate"]/option[2]').click()
driver.implicitly_wait(10)
settle_price = driver.find_element_by_xpath('//*[@id="settlementsFuturesProductTable"]/tbody/tr[1]/td[6]').text
print settle_price
http://www.cmegroup.com/trading/energy/crude-oil/light-sweet-crude_quotes_settlements_futures.html
(Crude Oil Prices; the drop-down contains five days.) The value needed from the page is the settle price for May 16, at this XPath:
//*[@id="settlementsFuturesProductTable"]/tbody/tr[1]/td[6]
Why is the text output taken from today's page and not from the page for the drop-down option that the browser switches to?
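Several things need to be introduced to solve it: the Select class and its .options property, to get and iterate over the drop-down options, and an explicit wait (WebDriverWait with expected_conditions), so that the table has finished reloading before the price is read.
Implementation: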
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.select import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)

url = 'http://www.cmegroup.com/trading/energy/crude-oil/light-sweet-crude_quotes_settlements_futures.html'
driver.get(url)

select = Select(driver.find_element_by_id("cmeTradeDate"))
for option in select.options:
    # selecting a value in the dropdown
    select.select_by_value(option.get_attribute("value"))

    # wait for the table to load
    wait.until(EC.invisibility_of_element_located((By.CSS_SELECTOR, ".cmeProgressPanel")))

    # get the desired price
    selected_price = driver.find_element_by_xpath('//*[@id="settlementsFuturesProductTable"]/tbody/tr[1]/td[6]')
    print(option.text, selected_price.text)
Prints:
(u'Friday, 15 Apr 2016 (Final)', u'40.36')
(u'Thursday, 14 Apr 2016 (Final)', u'41.50')
(u'Wednesday, 13 Apr 2016 (Final)', u'41.76')
(u'Tuesday, 12 Apr 2016 (Final)', u'42.17')
(u'Monday, 11 Apr 2016 (Final)', u'40.36')
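Since the prices are meant to be collected every week, it may help to turn the printed string pairs into typed values before storing them. The following is only a minimal sketch: parse_settlement is a hypothetical helper, not part of Selenium, and it assumes the option labels always look like the ones printed above ("Weekday, DD Mon YYYY (Status)") and that the cell text is a plain number. It is written for Python 3.

```python
import re
from datetime import date

# Month abbreviations as they appear in the drop-down labels shown above.
_MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# Matches labels such as "Friday, 15 Apr 2016 (Final)"; the status is optional.
_LABEL_RE = re.compile(
    r"^\w+,\s+(\d{1,2})\s+([A-Za-z]{3})\s+(\d{4})(?:\s+\((\w+)\))?$")


def parse_settlement(label, price_text):
    """Hypothetical helper: convert (option text, cell text) to typed values."""
    match = _LABEL_RE.match(label.strip())
    if match is None:
        raise ValueError("unexpected trade-date label: %r" % label)
    day, month, year, status = match.groups()
    trade_date = date(int(year), _MONTHS[month], int(day))
    # Assumption: the cell holds a plain decimal number, possibly with commas.
    return trade_date, status, float(price_text.replace(",", ""))


# Two of the pairs from the output above, used as sample input.
rows = [
    (u"Friday, 15 Apr 2016 (Final)", u"40.36"),
    (u"Thursday, 14 Apr 2016 (Final)", u"41.50"),
]
parsed = [parse_settlement(label, price) for label, price in rows]
```

In the scraping loop, the same helper could be called with option.text and selected_price.text instead of the sample rows. If the site ever shows a non-numeric placeholder in the cell, float() will raise, which is probably preferable to storing a wrong value silently.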
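The wait.until(...) call in the loop is what keeps the script from reading the table before it has been refreshed for the newly selected date, which is the likely reason the original script kept returning today's value. Conceptually, an explicit wait polls a condition until it becomes truthy or a timeout expires. The sketch below illustrates that idea in plain Python 3; it is a simplified illustration, not Selenium's actual implementation, and wait_until and FakeTable are hypothetical names invented for this example.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Simplified illustration of an explicit wait (not Selenium's own code)."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


class FakeTable(object):
    """Hypothetical stand-in for a page whose table is ready after a few polls."""

    def __init__(self, polls_until_ready):
        self.polls_left = polls_until_ready

    def is_loaded(self):
        self.polls_left -= 1
        return self.polls_left <= 0


table = FakeTable(polls_until_ready=3)
ready = wait_until(table.is_loaded, timeout=2.0, poll_interval=0.01)
```

By contrast, implicitly_wait only affects how long element lookups keep retrying when an element is missing. The old table cell is still present while the new data loads, so the lookup succeeds immediately and returns the stale value.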