Selenium Python - Extract Text from Class

I'm trying to extract the text from the second "deep" class in the following image. In this case it would be the word "Sauvage".

The Elements

I've done the following:

search_perfumes = driver.find_elements(By.XPATH,'//span[@class="deep"][1]')
for perfumes in search_perfumes:
    list_perfumes.append(perfumes.text)

The list has the expected length of 23 (the page shows 23 perfumes), but all 23 entries are empty strings. I can't seem to extract the text inside the "deep" class.

Any idea where I might be going wrong?

You are trying to extract the text of the second web element matching the //span[@class="deep"] XPath.
You are possibly missing a wait and reading the text before the element has completely loaded. I'm not sure about that, since you haven't shared all your code.
Please try this:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 20)
list_perfumes = []

# Wait until the first matching span is visible before collecting all of them
wait.until(EC.visibility_of_element_located((By.XPATH, '//span[@class="deep"]')))
search_perfumes = driver.find_elements(By.XPATH, '//span[@class="deep"]')
for perfumes in search_perfumes:
    list_perfumes.append(perfumes.text)

Based on the HTML that you've shared, you can use XPath indexing:

(//span[@class='deep'])[2] 

In code:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "(//span[@class='deep'])[2]"))).text)

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
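
The difference between `//span[@class="deep"][1]` and the parenthesised `(//span[@class='deep'])[2]` can be shown without a browser. The sketch below uses the standard library's `xml.etree.ElementTree` on hypothetical markup that assumes each product keeps its brand and name spans under one parent; the real page structure may differ. ElementTree supports only a subset of XPath, so the parenthesised form is emulated by indexing the result list.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: every product holds a brand span and a name span.
PAGE = """
<div>
  <div class="product"><span class="deep">DIOR</span><span class="deep">Sauvage</span></div>
  <div class="product"><span class="deep">DIOR</span><span class="deep">Eau Sauvage</span></div>
  <div class="product"><span class="deep">Creed</span><span class="deep">Neroli Sauvage</span></div>
</div>
"""

root = ET.fromstring(PAGE)

# A position predicate written directly on the step is evaluated per parent,
# so this yields one span for every product, not one span for the page.
first_per_parent = [s.text for s in root.findall(".//span[@class='deep'][1]")]

# (//span[@class='deep'])[2] indexes the whole result set instead.
# ElementTree has no parenthesised form, so index the list (0-based).
all_deep = root.findall(".//span[@class='deep']")
second_overall = all_deep[1].text

print(first_per_parent)
print(second_overall)
```

Under that assumption, this is consistent with the question's XPath returning 23 elements (one per product) rather than a single one.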

Now you must ensure that [2] represents Sauvage in the entire HTML. You can increase or decrease the index from [2] to any other matching number.

How would you do that? You need to make sure that the XPath has a unique matching node in the HTML DOM. A more detailed explanation follows:

Please check in the dev tools (Google Chrome) whether the XPath has a unique entry in the HTML DOM.

Steps to check:

Press F12 in Chrome -> go to the Elements tab -> press CTRL + F -> paste the XPath and check whether your desired element is highlighted as a 1/1 matching node.
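
The same uniqueness check can be sketched in code. `count_matches` and `is_unique` are hypothetical helpers, not Selenium functions; they only assume a driver object with Selenium's `find_elements(by, value)` method. The literal `"xpath"` is the string value of `By.XPATH`, used here so the snippet has no import requirements.

```python
XPATH = "xpath"  # same string as selenium.webdriver.common.by.By.XPATH


def count_matches(driver, xpath):
    """Return how many elements the XPath matches on the current page."""
    return len(driver.find_elements(XPATH, xpath))


def is_unique(driver, xpath):
    """True when the XPath matches exactly one element (the 1/1 case)."""
    return count_matches(driver, xpath) == 1
```

With a live session this would be called as `is_unique(driver, "(//span[@class='deep'])[2]")`.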

Also, you can get a list of web elements with the XPath //span[@class='deep']:

for ele in driver.find_elements(By.XPATH, "//span[@class='deep']"):
    print(ele.text)
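
Selenium's `.text` returns only text that is rendered as visible, which is one plausible cause of the empty strings in the question. As a minimal sketch, and not part of the answer above, a fallback can read the element's `textContent` when `.text` is empty. `element_text` is a hypothetical helper; it accepts any object exposing `.text` and `get_attribute`, such as a Selenium `WebElement`.

```python
def element_text(element):
    """Return the visible text, falling back to the DOM textContent."""
    visible = (element.text or "").strip()
    if visible:
        return visible
    # textContent is available even when the element is not displayed
    raw = element.get_attribute("textContent") or ""
    return raw.strip()
```

In the loop above, `print(element_text(ele))` would replace `print(ele.text)`.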

Update:

You have to click the Accept all cookie button first, which sits inside a shadow root:

Code:

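import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# driver_path is the path to your chromedriver executable
driver = webdriver.Chrome(service=Service(driver_path))

driver.maximize_window()
wait = WebDriverWait(driver, 30)

driver.get("https://www.parfumdreams.pt/?m=5&search=sauvage")

try:
    # Give the cookie banner time to render inside its shadow root
    time.sleep(2)
    cookie_btn = driver.execute_script('return document.querySelector("#usercentrics-root").shadowRoot.querySelector("#uc-center-container > div.sc-jJoQJp.dTzACB > div > div > div > button")')
    cookie_btn.click()
    print('Clicked')
except Exception:
    print('Could not click')

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "(//span[@class='deep'])[2]"))).text)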

Or, in case you want all of them, use the code below instead of the print command above:

for ele in driver.find_elements(By.XPATH, "//span[@class='deep']"):
    driver.execute_script("arguments[0].scrollIntoView(true);", ele)
    print(ele.text)

Output:

DIOR
Sauvage
DIOR
Sauvage
DIOR
Sauvage
DIOR
Sauvage
DIOR
Sauvage
DIOR
Sauvage
DIOR
Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
Creed
Neroli Sauvage
DIOR
Eau Sauvage
DIOR
Lápis de lábios
DIOR
Lápis de lábios
Estée Lauder
Maquilhagem para lábios
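
The output alternates between a brand and a product name. As a small sketch, assuming that alternation holds for every product on the page, the flat list of texts can be grouped into (brand, name) pairs. `pair_brand_and_name` is a hypothetical helper, and `sample` is a shortened excerpt of the output above.

```python
def pair_brand_and_name(texts):
    """Group a flat [brand, name, brand, name, ...] list into tuples."""
    if len(texts) % 2:
        raise ValueError("expected an even number of entries")
    # Even positions hold brands, odd positions hold product names
    return list(zip(texts[0::2], texts[1::2]))


sample = ["DIOR", "Sauvage", "DIOR", "Eau Sauvage", "Creed", "Neroli Sauvage"]
for brand, name in pair_brand_and_name(sample):
    print(f"{brand}: {name}")
```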
