Can't find element through any locator using Selenium in Python
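I'm trying to use Selenium in Python (with PyCharm 2019.2) to scrape the content of https://patents.google.com/patent/US4718386.
In particular, I need the classification code and its title (A23L3/358 - Inorganic compounds).
Google Patents recently changed this element, so my previous code no longer captures its content.
The HTML is now: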
<div class="style-scope classification-tree">
<concept-mention class="style-scope classification-tree">
<span id="target" tabindex="0" aria-label="Details of concept" role="link" class="style-scope concept-mention">
<iron-icon class="inline-icon style-scope concept-mention x-scope iron-icon-0" icon="icons:label"><svg viewBox="0 0 24 24" preserveAspectRatio="xMidYMid meet" focusable="false" class="style-scope iron-icon" style="pointer-events: none; display: block; width: 100%; height: 100%;"><g class="style-scope iron-icon"><path d="M17.63 5.84C17.27 5.33 16.67 5 16 5L5 5.01C3.9 5.01 3 5.9 3 7v10c0 1.1.9 1.99 2 1.99L16 19c.67 0 1.27-.33 1.63-.84L22 12l-4.37-6.16z" class="style-scope iron-icon"></path></g></svg>
</iron-icon>
<template is="dom-if" class="style-scope concept-mention"></template>
<state-modifier class="code style-scope classification-tree" act='{"type": "QUERY_ADD_CPC", "cpc": "$cpc"}' first="true" data-cpc="A23L3/358"><a id="link" href="/?q=APPLE&q=A23L3%2f358" class="style-scope state-modifier">A23L3/358</a></state-modifier>
<span class="description style-scope classification-tree">Inorganic compounds</span>
<template is="dom-if" restamp="" class="style-scope concept-mention"></template>
</span>
</concept-mention>
</div>
This is the code I used previously:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC  # needed for EC.* below
# driver is assumed to be an already-created WebDriver with the patent page loaded
Class_Content_year = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='style-scope classification-tree' and not(@hidden)]/state-modifier[@class='code style-scope classification-tree']/a[@id='link' and @class='style-scope state-modifier']"))).get_attribute("innerHTML")
Class_Content_title = WebDriverWait(driver, 30).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='style-scope classification-tree' and not (@hidden)]/span[@class='description style-scope classification-tree']"))).get_attribute("innerHTML")
I expected it to at least still find the title, but for some reason it doesn't. Can anyone help?
Thanks!
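Use the following find_element_by methods.
By class name (pass a single class name; a compound value such as "style-scope classification-tree" is not matched as written):
driver.find_elements_by_class_name("classification-tree")
By XPATH:
You can also match on id and class, but then you have to write out many of the attributes by hand.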
To extract the texts A23L3/358 and Inorganic compounds, you have to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following locator strategies:
To extract A23L3/358:
Using CSS_SELECTOR:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.classification-tree span.concept-mention state-modifier>a.state-modifier"))).get_attribute("innerHTML"))
Using XPATH:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='style-scope classification-tree']//span[@class='style-scope concept-mention']//state-modifier/a[@class='style-scope state-modifier']"))).get_attribute("innerHTML"))
To extract Inorganic compounds:
Using CSS_SELECTOR:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.classification-tree span.concept-mention span.description.classification-tree"))).get_attribute("innerHTML"))
Using XPATH:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='style-scope classification-tree']//span[@class='style-scope concept-mention']//span[@class='description style-scope classification-tree']"))).get_attribute("innerHTML"))
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
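A likely reason the XPath from the question stopped matching is that it uses the child axis (`/`) directly under the div, while in the new markup the code and description sit deeper, inside `concept-mention` and its inner `span`. The sketch below illustrates this with Python's standard-library `xml.etree.ElementTree`, not Selenium, on a trimmed copy of the posted HTML. The trimming (no icon, templates, or `act` attribute), the `<root>` wrapper, and the variable names are assumptions made for illustration; it is not the live page.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the markup from the question.
# Assumption: the icon, <template> elements and the act attribute are
# omitted because they do not affect the paths being compared.
SNIPPET = """
<root>
  <div class="style-scope classification-tree">
    <concept-mention class="style-scope classification-tree">
      <span id="target" class="style-scope concept-mention">
        <state-modifier class="code style-scope classification-tree" data-cpc="A23L3/358">
          <a id="link" class="style-scope state-modifier">A23L3/358</a>
        </state-modifier>
        <span class="description style-scope classification-tree">Inorganic compounds</span>
      </span>
    </concept-mention>
  </div>
</root>
"""

root = ET.fromstring(SNIPPET)
DIV = ".//div[@class='style-scope classification-tree']"
DESC = "span[@class='description style-scope classification-tree']"

# Old style: "/" only inspects direct children of the div.
old_code = root.findall(DIV + "/state-modifier/a")
old_title = root.findall(DIV + "/" + DESC)

# New style: "//" searches every descendant of the div.
new_code = root.findall(DIV + "//state-modifier/a")
new_title = root.findall(DIV + "//" + DESC)

print(len(old_code), len(old_title))
print(new_code[0].text, "-", new_title[0].text)
```

With the child axis both lookups come back empty on this snippet, whereas the descendant axis finds both elements, which matches the `//` steps used in the XPath locators above.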