
Selenium Python - Extract Text from Class

I'm trying to extract the text from the second span with class "deep" shown in the image below. In this case it would be the word "Sauvage".

[Image: The Elements]

I've done the following:

search_perfumes = driver.find_elements(By.XPATH,'//span[@class="deep"][1]')
for perfumes in search_perfumes:
    list_perfumes.append(perfumes.text)

The list has 23 elements, which is correct since the page has 23 perfumes, but all 23 are empty strings. I can't seem to extract the text inside the spans with class "deep".

Any idea on where I might be going wrong?

You are trying to extract the text from the second web element matching the XPath //span[@class="deep"].
You are possibly missing a wait and reading the text before the element has completely loaded. I can't be sure, since you didn't share all your code.
Please try this:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

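wait = WebDriverWait(driver, 20)
list_perfumes = []

# Block (up to 20 seconds) until the first matching span is visible.
wait.until(EC.visibility_of_element_located((By.XPATH, '//span[@class="deep"]')))
# Then collect the text of every matching span.
search_perfumes = driver.find_elements(By.XPATH,'//span[@class="deep"]')
for perfumes in search_perfumes:
    list_perfumes.append(perfumes.text)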
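
If the wait alone still yields empty strings, one possible cause (an assumption here, since the full page is not shown) is that the spans exist in the DOM but are not rendered as visible. WebElement.text returns only visible text, while the textContent attribute is read from the DOM regardless of visibility. A minimal sketch of a fallback follows; read_text is a hypothetical helper name, not part of Selenium:

```python
def read_text(element):
    """Return an element's visible text, or its raw textContent if that is empty.

    `element` is expected to be a Selenium WebElement (anything with a `.text`
    attribute and a `get_attribute()` method works).
    """
    visible = element.text.strip()
    if visible:
        return visible
    # .text is empty for hidden elements; textContent comes straight from the DOM.
    raw = element.get_attribute("textContent") or ""
    return raw.strip()
```

With it, the loop above could become list_perfumes = [read_text(el) for el in search_perfumes].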

Based on the HTML that you've shared, you can use XPath indexing:

(//span[@class='deep'])[2] 

In code:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "(//span[@class='deep'])[2]"))).text)

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Now you must ensure that [2] represents Sauvage in the entire HTML. You can increase or decrease the index from [2] to any other matching number.
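
The parentheses matter. //span[@class='deep'][1] selects the first matching span under each parent, so it can match once per product, whereas (//span[@class='deep'])[2] selects the second match in the whole document. Below is a rough illustration on made-up markup (the "product" divs are an assumption, not the real page). It uses the standard library's xml.etree.ElementTree so that it runs without a browser. ElementTree supports only a subset of XPath and has no parenthesized form, so plain list indexing stands in for it:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: two "deep" spans (brand, name) per product.
SAMPLE = """
<div>
  <div class="product"><span class="deep">DIOR</span><span class="deep">Sauvage</span></div>
  <div class="product"><span class="deep">DIOR</span><span class="deep">Eau Sauvage</span></div>
  <div class="product"><span class="deep">Creed</span><span class="deep">Neroli Sauvage</span></div>
</div>
"""

root = ET.fromstring(SAMPLE)

# Position predicate without parentheses: first span inside EACH parent.
first_per_parent = [e.text for e in root.findall(".//span[@class='deep'][1]")]

# Stand-in for (//span[@class='deep'])[2]: collect all matches in document
# order, then take the second one (XPath counts from 1, Python from 0).
all_spans = root.findall(".//span[@class='deep']")
second_overall = all_spans[1].text

print(first_per_parent)  # ['DIOR', 'DIOR', 'Creed']
print(second_overall)    # Sauvage
```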

How would you do that? You need to make sure the XPath matches a unique node in the HTML DOM. A more detailed explanation follows.

Check in the dev tools (Google Chrome) whether the XPath has a unique match in the HTML DOM.

Steps to check:

Press F12 in Chrome -> go to the Elements tab -> press CTRL + F -> paste the XPath and check whether your desired element is highlighted with 1/1 matching node.
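
The same uniqueness check can be scripted instead of done by hand. This is a minimal sketch, assuming an already open Selenium driver. count_matches and is_unique are hypothetical helper names, and the locator strategy is passed as the string "xpath", which is the value of By.XPATH:

```python
def count_matches(driver, xpath):
    """Return how many elements in the current page match `xpath`."""
    # find_elements returns an empty list (it does not raise) when nothing matches.
    return len(driver.find_elements("xpath", xpath))


def is_unique(driver, xpath):
    """True when the XPath matches exactly one element, like 1/1 in DevTools."""
    return count_matches(driver, xpath) == 1
```

For example, is_unique(driver, "(//span[@class='deep'])[2]") should be True once the page has loaded, while count_matches(driver, "//span[@class='deep']") gives the total number of such spans.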

Also, you can get a list of all matching web elements with the XPath //span[@class='deep']:

for ele in driver.find_elements(By.XPATH, "//span[@class='deep']"):
    print(ele.text)

Update:

You have to click the "Accept all" cookie button first, which is inside a shadow root:

Code:

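import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Selenium 4.6+ locates the driver binary itself; older versions took a driver path here.
driver = webdriver.Chrome()

driver.maximize_window()
wait = WebDriverWait(driver, 30)

driver.get("https://www.parfumdreams.pt/?m=5&search=sauvage")

try:
    time.sleep(2)  # give the cookie banner time to render
    # The banner lives inside a shadow root, so reach into it with JavaScript.
    cookie_btn = driver.execute_script('return document.querySelector("#usercentrics-root").shadowRoot.querySelector("#uc-center-container > div.sc-jJoQJp.dTzACB > div > div > div > button")')
    cookie_btn.click()
    print('Clicked')
except Exception:
    print('Could not click')

# Wait for the second "deep" span in the document and print its text.
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "(//span[@class='deep'])[2]"))).text)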

Or, in case you want all of them, replace the print command above with the code below:

for ele in driver.find_elements(By.XPATH, "//span[@class='deep']"):
    driver.execute_script("arguments[0].scrollIntoView(true);", ele)
    print(ele.text)

Output:

DIOR
Sauvage
DIOR
Sauvage
DIOR
Sauvage
DIOR
Sauvage
DIOR
Sauvage
DIOR
Sauvage
DIOR
Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
DIOR
Eau Sauvage
Creed
Neroli Sauvage
DIOR
Eau Sauvage
DIOR
Lápis de lábios
DIOR
Lápis de lábios
Estée Lauder
Maquilhagem para lábios
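
In this output the texts alternate between brand and product name. Assuming that pattern holds for every item (it does in the listing above, but the page may change), the flat list can be paired up. pair_brand_and_name is a hypothetical helper name:

```python
def pair_brand_and_name(texts):
    """Group a flat [brand, name, brand, name, ...] list into (brand, name) tuples."""
    if len(texts) % 2 != 0:
        raise ValueError("expected an even number of texts (brand/name pairs)")
    # Even positions hold brands, odd positions hold product names.
    return list(zip(texts[0::2], texts[1::2]))


sample = ["DIOR", "Sauvage", "DIOR", "Eau Sauvage", "Creed", "Neroli Sauvage"]
print(pair_brand_and_name(sample))
# [('DIOR', 'Sauvage'), ('DIOR', 'Eau Sauvage'), ('Creed', 'Neroli Sauvage')]
```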

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address.

 