
Selenium: access denied

I am trying to scrape some data from the LV (Louis Vuitton) website with Selenium, and I keep getting an 'Access Denied' screen once the 'sign in' button is clicked. I suspect there is some protection against automation, because everything works fine when I do the same manually. Oddly, I need to click the 'sign in' button twice to sign in manually.

My code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'chromedriver.exe')
WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//span[@class='ucm-wrapper']")))
driver.find_element_by_xpath("//button[@class='ucm-button ucm-button--default ucm-choice__yes']").click()
driver.find_element_by_id('passwordloginForm').send_keys('xxxxxx')


You don't have permission to access "http://secure.louisvuitton.com/eng-gb/mylv;jsessionid=xxxxxxx.front61-prd?" on this server.

Is there a way to login with Selenium and bypass this?

I took your code, added a few tweaks, and ran the test as follows:

  • Code Block:

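     from selenium import webdriver
     from selenium.webdriver.common.by import By
     from selenium.webdriver.support import expected_conditions as EC
     from selenium.webdriver.support.ui import WebDriverWait

     # `driver` is created with the same ChromeOptions as in the question
     driver.get('https://secure.louisvuitton.com/eng-gb/mylv')
     # Dismiss the cookie banner, then fill in and submit the login form
     WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//span[text()='Accept and Continue']"))).click()
     WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@id='loginloginForm']"))).send_keys("Mudyla@stackoverflow.com")
     driver.find_element_by_xpath("//input[@id='passwordloginForm']").send_keys('Mudyla')
     driver.find_element_by_xpath("//input[@id='loginSubmit_']").click()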


Similar to your observation, I hit the same roadblock and got no results.


Deep Dive

It seems the click() on Sign In does happen. However, when inspecting the DOM tree of the webpage, you will find that some of the <script> tags refer to JavaScript files containing the keyword akam. For example:

  • akam-sw.js install script version 1.3.3 "serviceWorker"in navigator&&"find"in[]&&function()
  • <script type="text/javascript" src="https://secure.louisvuitton.com/akam/11/7f0e2ae6" defer=""></script>
  • <noscript><img src="https://secure.louisvuitton.com/akam/11/pixel_7f0e2ae6?a=dD0xOWNjNTRjMmMxYzdmNmMwZjI0NTUwOGZmZDM5ZTQzMWQ5NjI5ZmIwJmpzPW9mZg==" style="visibility: hidden; position: absolute; left: -999px; top: -999px;" /></noscript>

This is a clear indication that the website is protected by Bot Manager, an advanced bot detection service provided by Akamai, and that the response is being blocked.
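The inspection described above can be approximated in code. The following is a minimal sketch, assuming the page source is available as a string (for example from driver.page_source). The function name find_akamai_markers and both regular expressions are hypothetical; they are modelled only on the akam-sw.js and /akam/ URLs quoted above, not on any documented Akamai interface, so treat a match as a hint rather than proof.

```python
import re

# Heuristic patterns based only on the markers observed on this page.
# They are assumptions for illustration, not an official Akamai signature.
AKAMAI_PATTERNS = (
    re.compile(r"akam-sw\.js"),
    re.compile(r"""src=["'][^"']*/akam/[^"']*["']"""),
)


def find_akamai_markers(html: str) -> list:
    """Return every substring of `html` that matches one of the patterns."""
    hits = []
    for pattern in AKAMAI_PATTERNS:
        hits.extend(match.group(0) for match in pattern.finditer(html))
    return hits
```

With Selenium, this could be called as find_akamai_markers(driver.page_source) after the page has loaded; an empty list means none of these particular markers were found, which does not rule out other protection.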

Bot Manager

As per the article Bot Manager - Foundations :



So it can be concluded that the request for the data is detected as coming from a Selenium-driven WebDriver instance, and the response is blocked.



It has been a while since I posted this question, but if anyone is interested, below are the steps I took to solve the problem.

  1. Open chromedriver.exe in a hex editor, find the string $cdc, and replace it with something else of the same length. Then save and run the modified binary. Read more in this answer and the replies to it.

  2. Selenium python code:

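from selenium import webdriver

# Drop the "enable-automation" switch and the automation extension
options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
# chromedriver.exe here is the binary modified in step 1
driver = webdriver.Chrome(options=options, executable_path='chromedriver.exe')
# Hide navigator.webdriver; this applies only to the document currently loaded
driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
# Send a regular desktop Chrome user agent
driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                                                                     'AppleWebKit/537.36 (KHTML, like Gecko) '
                                                                     'Chrome/85.0.4183.102 Safari/537.36'})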
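For those who would rather not use a hex editor, step 1 can be sketched in Python. This is only an illustration of the same idea: the helper names patch_marker and patch_file, the default replacement $abc, and the file paths are all hypothetical. The one requirement taken from the step itself is that the replacement has the same length as the original string, so that the size and offsets of the binary stay unchanged.

```python
from pathlib import Path
from typing import Tuple


def patch_marker(data: bytes,
                 marker: bytes = b"$cdc",
                 replacement: bytes = b"$abc") -> Tuple[bytes, int]:
    """Replace every occurrence of `marker` with a same-length `replacement`.

    Returns the patched bytes and the number of occurrences replaced.
    """
    if len(replacement) != len(marker):
        raise ValueError("replacement must be the same length as the marker")
    count = data.count(marker)
    return data.replace(marker, replacement), count


def patch_file(src: str, dst: str) -> int:
    """Write a patched copy of `src` to `dst`; the original file is left untouched."""
    patched, count = patch_marker(Path(src).read_bytes())
    Path(dst).write_bytes(patched)
    return count
```

A call such as patch_file('chromedriver.exe', 'chromedriver_patched.exe') would write the patched copy next to the original; a return value of 0 means the string was not found in that build of the driver.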


