Dynamic content not rendered when scraping with Selenium

Question

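I am trying to scrape with Selenium, using a script that used to work under Python 3.7.

Last week I had to reset my PC, and I installed the latest versions of Python and of all the packages the script uses.

What I observe now is that none of the dynamic values are being rendered; they show up as raw template tags (the header ones, for example). See some of the output below: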

<tr>
<td class="textsr">Close</td>
<td class="textvalue">{{ScripHeaderData.Header.Close}}</td>
</tr>

<tr>
<td class="textsr">WAP</td>
<td class="textvalue">{{StkTrd.WAP}}</td>
</tr>

<tr>
<td class="textsr">Big Value</td>
<td class="textvalue">{{checknullheader(CompData.BigVal)?'-':(CompData.BigVal)}}</td>
</tr>

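I have been using the script for my research and need it back in working order, so any guidance is appreciated.

Here is the snippet for reference: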

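import time

from bs4 import BeautifulSoup
from selenium import webdriver

# q (presumably a queue.Queue of URLs) and opts (the Chrome options)
# are defined elsewhere in the script
target_url = q.get(timeout=1)
time.sleep(1)
driver = webdriver.Chrome('./chromedriver', options=opts)
driver.get(target_url)
# this is just to ensure that the page is loaded
time.sleep(5)

html_content = driver.page_source

soup = BeautifulSoup(html_content, features="html.parser")

# walk every table row and read the text of each cell
table_rows = soup.find_all('tr')
for row in table_rows:
    table_cols = row.find_all('td')
    for col in table_cols:
        label_value = col.text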

Answer 1

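While it is tempting to use time.sleep to wait for the page to load, it is better to use Selenium's waits with an expected condition, ideally one tied to the element you want. https://www.selenium.dev/documentation/webdriver/waits/

Here is another thread with good answers on waits and conditions versus time.sleep: How to sleep Selenium WebDriver in Python for milliseconds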
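
As a rough illustration of the idea, here is a minimal sketch that waits until the page source no longer contains `{{...}}` placeholders instead of sleeping for a fixed time. The helper names (`is_rendered`, `page_is_rendered`, `get_rendered_source`) and the 15-second timeout are my own assumptions, not something from the question. Checking the whole page source is a blunt test: if the site keeps templates inside script tags it may never pass, in which case a condition on the specific cells would be a better fit. It also will not help if the templates never render at all.

```python
import re

# Matches an unrendered template placeholder such as {{StkTrd.WAP}}.
PLACEHOLDER = re.compile(r"\{\{.*?\}\}")


def is_rendered(html):
    """Return True when the HTML contains no {{...}} placeholders."""
    return PLACEHOLDER.search(html) is None


def page_is_rendered(driver):
    """Wait condition: WebDriverWait calls this repeatedly with the driver."""
    return is_rendered(driver.page_source)


def get_rendered_source(driver, url, timeout=15):
    """Load url and return page_source once the placeholders are gone.

    Raises selenium.common.exceptions.TimeoutException if placeholders
    are still present after `timeout` seconds (hypothetical default).
    """
    # Imported here so the helpers above can be used without Selenium.
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(url)
    WebDriverWait(driver, timeout).until(page_is_rendered)
    return driver.page_source
```

In the question's snippet, the `driver.get(...)`, `time.sleep(5)` and `driver.page_source` lines could then be replaced by `html_content = get_rendered_source(driver, target_url)`.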

Answer 2

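I went through many forums and tried many of the suggestions (waits, driver options, changing the web driver, switching content, etc.), but my problem seems to be more specific and was not resolved.

In the end I went back to my old setup (running Python 3.9.6), and the script was back in a working state.

Thanks to Joe Carboni for his time and input on this.

It is a bit frustrating that I could not find the root cause of the problem or a workaround for it. Still, I am posting what I did here in case it helps someone. Cheers.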
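
Since the fix turned out to be a change of environment, it may help anyone hitting the same thing to record the versions in both the working and the broken setup and compare them. The following is only a sketch: the `installed_versions` helper and the package list are my own choices, and it relies on `importlib.metadata`, which needs Python 3.8 or newer.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Print the interpreter version, then one line per package.
    print("python", sys.version.split()[0])
    found = installed_versions(["selenium", "beautifulsoup4"])
    for name, version in found.items():
        print(name, version or "not installed")
```

Running it in each environment and diffing the two outputs at least narrows down which upgrade changed the behaviour. The browser and chromedriver versions are not covered by this and would need to be checked separately.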
