
Webscraping question in python using Selenium

I am trying to scrape a website using Selenium in Python. I want the solar data from this site and section: https://www.caiso.com/TodaysOutlook/Pages/supply.html#section-renewables-trend

I think the problem is that the "Chart data (CSV)" menu option does not behave like a button, so clicking it doesn't work. This is what I see when I inspect the element before and after clicking the "Chart data (CSV)" menu option.

Before: <a class="dropdown-item mb-0" id="downloadRenewablesCSV" data-type="text/csv">Chart data (CSV)</a>

After: <a class="dropdown-item mb-0" id="downloadRenewablesCSV" data-type="text/csv" href="data:text/csv;charset=utf8,Renewables%2007%2F20%2 ... [a lot of encoded data] ...2C209%2C211%2C211%2C211%2C212%2C211%2C211%2C210%0A" download="CAISO-renewables-20220720.csv">Chart data (CSV)</a>
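
The `href` that appears after the click is a `data:` URI, so the CSV text itself is embedded in the attribute in percent-encoded form (`%2C` is a comma, `%0A` a newline). A minimal sketch of decoding such a value with the standard library follows. The function name `decode_csv_data_uri` and the sample string are made up for illustration and are not real CAISO data.

```python
from urllib.parse import unquote


def decode_csv_data_uri(href: str) -> str:
    """Return the CSV text carried by a percent-encoded data:text/csv URI."""
    header, sep, payload = href.partition(",")
    if not sep or not header.startswith("data:text/csv"):
        raise ValueError("not a data:text/csv URI")
    # Sketch only: base64-encoded payloads are not handled here.
    return unquote(payload)


# Made-up sample in the same shape as the href above (not real CAISO data).
sample = "data:text/csv;charset=utf8,a%2Cb%0A1%2C2%0A"
csv_text = decode_csv_data_uri(sample)
print(csv_text)
```

If this assumption about the attribute holds on the live page, the decoded text could be written to a file or handed to the `csv` module without relying on the browser's download folder.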

Originally I assumed it was just a button element that would download the CSV file, so I tried this:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome(executable_path='PATH')
driver.get('https://www.caiso.com/TodaysOutlook/Pages/supply.html')
button = driver.find_element(by='xpath',value='/html/body/div[1]/div[3]/div[8]/div/div/div[2]/nav/div[3]/div/a[1]')
button.click()

This isn't working. Any advice? I am very new to Selenium, sorry.

JS Path Interaction:

XPath selectors can be a bit finicky, so I would go back to basics and interact with the element via its JS path. I was able to reproduce the error and download the report using the JS path instead. Try the following updated code:

driver.get('https://www.caiso.com/TodaysOutlook/Pages/supply.html')
driver.execute_script("el = document.querySelector('#downloadRenewablesCSV');el.click();")

You were trying to click the download link without first expanding the dropdown. The element only becomes interactable once the dropdown has been clicked.

The show class is added dynamically to the div only once the div is clicked.

The code below should work, because it clicks the dropdown button first:

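from selenium.webdriver.common.by import By

# Expand the dropdown first so the menu item becomes interactable
dropdown = driver.find_element(By.XPATH, "//button[@id='dropdownMenuRenewables']")
dropdown.click()
# Then click the "Chart data (CSV)" menu item
download_b = driver.find_element(By.XPATH, "//a[@id='downloadRenewablesCSV']")
download_b.click()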

This will download the file for you.
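
Since the question shows the `href` being filled in only after the click, a script may want to wait for that attribute before reading it. The sketch below polls for it. It is an illustration, not part of either answer: the helper name `wait_for_href` is hypothetical, and `element` is only assumed to expose `get_attribute` the way a Selenium WebElement does.

```python
import time


def wait_for_href(element, timeout: float = 5.0, poll: float = 0.2) -> str:
    """Poll element.get_attribute('href') until it holds a data: URI.

    `element` is assumed to behave like a Selenium WebElement; only
    get_attribute() is used. Raises TimeoutError if nothing shows up.
    """
    deadline = time.monotonic() + timeout
    while True:
        href = element.get_attribute("href")
        if href and href.startswith("data:"):
            return href
        if time.monotonic() >= deadline:
            raise TimeoutError("href was not populated in time")
        time.sleep(poll)
```

With a real driver this might be called as `href = wait_for_href(download_b)` right after `download_b.click()`. Selenium's own `WebDriverWait` offers the same kind of polling and is probably the better choice in a larger script.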
