
How to extract data from dynamic collapsing table with hidden elements using Selenium in Python

I am trying to scrape the 20 classifications from https://patents.google.com/patent/JP2009517369A/en?oq=JP2009517369 . Only the first one is displayed; the others are hidden in an expandable section.

I already tried to get the first visible one with:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

# Chrome is assumed here; any WebDriver should work the same way
driver = webdriver.Chrome()
driver.get("https://patents.google.com/patent/JP2009517369A/en?oq=JP2009517369")

WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='style-scope classification-tree' and not(@hidden)]/state-modifier[@class='code style-scope classification-tree']/a[@class='style-scope state-modifier']"))).get_attribute("innerHTML")

However, it raises an exception and I don't know why. So I figured that scraping the whole table would be easier, but most of the elements are collapsed.

Is there an approach for scraping dynamic hidden tables? Thank you for your help!

The first two options should print the value C07C311/51:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='style-scope classification-tree' and not(@hidden)]/state-modifier[@class='code style-scope classification-tree']/a[@class='style-scope state-modifier']"))).text)

Or:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='style-scope classification-tree' and not(@hidden)]/state-modifier[@class='code style-scope classification-tree']/a[@class='style-scope state-modifier']"))).get_attribute("innerHTML"))

However, if you do not get the expected value, try the last one. It reads textContent, which should return the text even when the content is hidden:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='style-scope classification-tree' and not(@hidden)]/state-modifier[@class='code style-scope classification-tree']/a[@class='style-scope state-modifier']"))).get_attribute("textContent"))
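To collect every classification rather than only the first, one possible extension is to wait for presence instead of visibility and read textContent from each match. This is a minimal sketch, not a verified solution: it assumes the collapsed entries are already in the DOM and match the same XPath as the visible one, which has not been checked against the page. The helper names `normalize_codes` and `collect_classification_codes` are hypothetical.

```python
def normalize_codes(raw_texts):
    """Strip whitespace, drop empty values, and de-duplicate while keeping order."""
    seen = set()
    codes = []
    for text in raw_texts:
        code = (text or "").strip()
        if code and code not in seen:
            seen.add(code)
            codes.append(code)
    return codes


def collect_classification_codes(driver, timeout=20):
    """Return the text of every matching classification link, hidden or not."""
    # Imported here so normalize_codes can be tried without Selenium or a browser.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Same XPath as in the snippets above.
    # Assumption: the collapsed entries match it as well.
    xpath = (
        "//div[@class='style-scope classification-tree' and not(@hidden)]"
        "/state-modifier[@class='code style-scope classification-tree']"
        "/a[@class='style-scope state-modifier']"
    )
    # Wait for presence rather than visibility, so elements that are in the
    # DOM but not rendered are returned too.
    links = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.XPATH, xpath))
    )
    # textContent is used because .text is empty for elements that are not displayed.
    return normalize_codes(link.get_attribute("textContent") for link in links)
```

With the `driver` from the question already on the patent page, `print(collect_classification_codes(driver))` would print the list of codes. If fewer than 20 come back, the collapsed section is probably rendered only after it is expanded, and it would need to be opened first.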
