Web scraping problem using Selenium in Python

Question

I am trying to scrape a page using Selenium in Python. I want the solar data from this site and section: https://www.caiso.com/TodaysOutlook/Pages/supply.html#section-renewables-trend

I think the problem is that the "Chart data (CSV)" menu option does not behave like a button, so clicking it doesn't work. This is what I see when I inspect the element before and after clicking the "Chart data (CSV)" menu option.

Before: <a class="dropdown-item mb-0" id="downloadRenewablesCSV" data-type="text/csv">Chart data (CSV)</a>

After: <a class="dropdown-item mb-0" id="downloadRenewablesCSV" data-type="text/csv" href="data:text/csv;charset=utf8,Renewables%2007%2F20%2 ... [a lot of encoded data] ...2C209%2C211%2C211%2C211%2C212%2C211%2C211%2C210%0A" download="CAISO-renewables-20220720.csv">Chart data (CSV)</a>

Originally I assumed it was just a button element that would download the CSV file, and I was trying to do this:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome(executable_path='PATH')
driver.get('https://www.caiso.com/TodaysOutlook/Pages/supply.html')
button = driver.find_element(by='xpath',value='/html/body/div[1]/div[3]/div[8]/div/div/div[2]/nav/div[3]/div/a[1]')
button.click()

This isn't working. Any advice? I am very new to Selenium, sorry.

Answer 1

JS Path Interaction:

XPath selectors can be a bit finicky, so I would go back to basics and interact with the element via its JS path. I was able to reproduce the error and download the report using the JS path instead. Use the following updated code:

driver.get('https://www.caiso.com/TodaysOutlook/Pages/supply.html')
driver.execute_script("el = document.querySelector('#downloadRenewablesCSV');el.click();")
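
As a side note, the "After" markup in the question shows that the click fills the link's href with a data: URI holding the whole CSV. If that also happens when the click is issued from JavaScript (an assumption, not something verified here), the CSV text could be read straight from the attribute instead of from the downloaded file. A minimal sketch: data_uri_to_text and fetch_renewables_csv are hypothetical helper names, and only the element id comes from the question.

```python
import base64
from urllib.parse import unquote


def data_uri_to_text(href: str) -> str:
    """Decode a data: URI (percent-encoded or base64) into text."""
    if not href or not href.startswith("data:"):
        raise ValueError("expected a data: URI")
    # Everything before the first comma is the media type and flags.
    header, _, payload = href.partition(",")
    if header.endswith(";base64"):
        return base64.b64decode(payload).decode("utf-8")
    return unquote(payload)


def fetch_renewables_csv(driver) -> str:
    """Click the CSV link via JS, then decode the href it was given.

    Assumes the page's click handler sets href on #downloadRenewablesCSV,
    as the "After" markup in the question suggests.
    """
    driver.execute_script(
        "document.querySelector('#downloadRenewablesCSV').click();"
    )
    # "id" is the locator string that By.ID stands for.
    link = driver.find_element("id", "downloadRenewablesCSV")
    return data_uri_to_text(link.get_attribute("href"))
```

With a live driver this would be called as csv_text = fetch_renewables_csv(driver) after driver.get(...). If href is still empty at that point, data_uri_to_text raises ValueError, which is a hint that the page needs more time or that the assumption above does not hold.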

Answer 2

You were trying to click the download link without first expanding the dropdown. The element only becomes interactable once the dropdown has been clicked.

The show class is added to the div dynamically, and only once the div is clicked.

The code below clicks the dropdown button first and should then work:

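from selenium.webdriver.common.by import By

# Expand the dropdown so its menu items become interactable
dropdown = driver.find_element(By.XPATH, "//button[@id='dropdownMenuRenewables']")
dropdown.click()
# Now the CSV link can be clicked
download_b = driver.find_element(By.XPATH, "//a[@id='downloadRenewablesCSV']")
download_b.click()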

This will download the file for you.
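
Since the question asks for the solar data specifically, here is a rough sketch of pulling one series out of the CSV text once you have it. The layout is an assumption: a header row whose first cell is a title followed by time labels, then one row per source with its label in the first cell. The file itself was not inspected for this answer, so adjust to what it actually contains. extract_series is a hypothetical helper, and the sample values are made up.

```python
import csv
import io


def extract_series(csv_text: str, label: str = "Solar") -> dict[str, float]:
    """Return {column header: value} for the row whose first cell is `label`.

    Assumes one header row followed by one row per source (hypothetical
    layout; check it against the real file).
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("empty CSV")
    header, *body = rows
    for row in body:
        if row and row[0].strip().lower() == label.lower():
            # Pair each time label with its value, skipping blank cells.
            return {
                time: float(value)
                for time, value in zip(header[1:], row[1:])
                if value.strip()
            }
    raise KeyError(f"no row labelled {label!r}")


# Made-up sample in the assumed layout, for illustration only.
sample = (
    "Renewables (sample),00:00,00:05,00:10\n"
    "Solar,0,0,12\n"
    "Wind,2091,2110,2105\n"
)
solar = extract_series(sample)
```

For the downloaded file, the text could come from something like open("CAISO-renewables-20220720.csv").read(), using the file name shown in the link's download attribute.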

2 answers

Answer 1: score 1, accepted, 2022-07-20 17:56:04
Answer 2: score 1, 2022-07-20 18:04:33
