
Selenium 4 in Python works with Edge option headless False, but not with True

I have a function that scrapes some information from a website ( https://www.fragrantica.com/perfume/Dior/Sauvage-Eau-de-Parfum-48100.html ); I want to collect the ratings. I have Selenium 4 installed, along with webdriver_manager to take care of my drivers, among other packages.

When I use the headless option I get the 'Unable to locate element' error, but when I comment it out it works just fine. I tried using Edge headless for another site (but that was a week ago) and it seemed to work. Here is the code:

import os
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.edge.service import Service
from selenium.webdriver.edge.options import Options
from webdriver_manager.microsoft import EdgeChromiumDriverManager


def get_info(url):
    '''Get all the ratings from fragrantica site.'''
    os.environ['WDM_LOCAL'] = '1'
    os.environ['WDM_LOG_LEVEL'] = '0'
    options = Options()
    options.headless = True
    options.add_experimental_option('excludeSwitches', ['enable-logging'])

    driver = webdriver.Edge(service=Service(
        EdgeChromiumDriverManager().install()), options=options)

    try:
        driver.get(url)
        lst = []
        name = driver.find_element(
            By.XPATH, "//h1[contains(@class,'text-center medium-text-left')]").text
        WebDriverWait(driver, 30).until(ec.presence_of_element_located((
            By.XPATH,
            '//*[@id="main-content"]/div[1]/div[1]/div/div[2]/div[4]/div[2]/div/div[1]/div[3]/div/div')))
        ratings = driver.find_elements(By.XPATH,
                                       './/div[@style="width: 100%; height: 0.3rem; border-radius: 0.2rem; '
                                       'background: rgba(204, 224, 239, 0.4);"]')
        votes = driver.find_element(
            By.XPATH, "//span[contains(@itemprop,'ratingCount')]").text
        for style in ratings:
            lst.append(style.find_element(
                By.TAG_NAME, 'div').get_attribute('style'))
        return name, lst, votes
    finally:
        # Quit the browser whether scraping succeeded or raised.
        driver.quit()

Do you guys have any idea how to work around this? I have been trying to find an explanation, but with no success. It would be inconvenient to have the browser pop up all the time.

Thank you very much!

I ran into this kind of issue before. In that case, the cause was that Edge presented itself as an older browser version in headless mode. The page renders differently for the older version, so the element can't be located.
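One way to check whether this applies to you is to print the user agent that the page actually sees in headless mode. The snippet below is only a rough sketch: `looks_headless` and `report_user_agent` are hypothetical helper names, and the assumption that a headless Chromium-based browser puts the word "Headless" into its user agent should be confirmed against your own output.

```python
def looks_headless(user_agent):
    """Return True if the user agent string advertises a headless browser."""
    # Assumption: headless Chromium builds include "Headless" in a product
    # token (for example "HeadlessChrome/..."). Verify this on your machine.
    return "headless" in user_agent.lower()


def report_user_agent(driver):
    """Ask a live Selenium driver which user agent the page sees."""
    # execute_script runs JavaScript in the page and returns its result.
    user_agent = driver.execute_script("return navigator.userAgent")
    print("navigator.userAgent:", user_agent)
    print("advertises headless:", looks_headless(user_agent))
    return user_agent
```

Call `report_user_agent(driver)` right after `driver.get(url)`, once with headless on and once with it off, and compare the two strings.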

I think this might also be the cause of your issue. You can try to override the user agent by adding the argument user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36 Edg/101.0.1210.32 to the Edge options. Change the Edge version in the user agent to your own.
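In code, the override could look roughly like this. It is a minimal sketch, not a tested fix for this particular site: `build_headless_args` and `make_options` are hypothetical helper names, the user agent template is the string from this answer with the Edge version made a parameter, and the value passed to `--user-agent=` is assumed to be the bare user agent string without a `User-Agent:` prefix.

```python
# User agent from the answer, with the Edge version left as a placeholder.
UA_TEMPLATE = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36 "
    "Edg/{edge_version}"
)


def build_headless_args(edge_version):
    """Return the command-line switches for headless Edge with a custom UA."""
    user_agent = UA_TEMPLATE.format(edge_version=edge_version)
    return ["--headless", "--user-agent=" + user_agent]


def make_options(edge_version):
    """Build EdgeOptions carrying the switches above."""
    # Imported here so build_headless_args can be used without Selenium.
    from selenium.webdriver import EdgeOptions

    options = EdgeOptions()
    for arg in build_headless_args(edge_version):
        options.add_argument(arg)
    return options
```

You would then pass `options=make_options("101.0.1210.32")` (with your own Edge version) to `webdriver.Edge(...)` in place of the options object from the question.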

Ref link: Running Selenium Headless Tests on Outlook shows older version of outlook on Edge chromium browser

Import EdgeOptions from selenium.webdriver, pass the headless flag with add_argument, and update Selenium to the latest version:

from selenium.webdriver import EdgeOptions
options = EdgeOptions()
options.add_argument("--headless")
