
Parsing data from leboncoin gets stuck on the captcha with Python and Selenium

I am using Python and Selenium to parse data from the French site leboncoin.fr. I have tried many solutions that I found here on Stack Overflow, such as this one. Nonetheless, I keep getting stuck on the captcha: I solve it manually to continue, but it then launches again, non-stop, so I can never reach the page itself.

Is there any other way to scrape this website, or to avoid getting stuck like this?

I have also tried this code:

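from selenium import webdriver
from selenium.webdriver.chrome.service import Service

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
# Hide the most obvious automation markers
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
# Selenium 4 takes the driver path via Service instead of executable_path
driver = webdriver.Chrome(service=Service("./chromedriver"), options=options)
driver.get("https://www.leboncoin.fr/")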

With Selenium alone, you can't avoid the captcha, because leboncoin uses DataDome protection.

I suggest you take another approach, IP rotation for example, but it is not easy and not 100% reliable. You could search for information about Selenium and DataDome, but I have not found anything that resolves your problem.
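To make the IP rotation idea concrete, here is a minimal sketch of how a different proxy could be handed to Chrome on each launch. It is only an illustration: the proxy addresses are hypothetical placeholders (documentation-range IPs), `next_proxy_argument` and `make_driver` are names invented for this example, and nothing here is known to get past DataDome.

```python
from itertools import cycle

# Hypothetical proxy endpoints (documentation-range IPs); replace with real ones.
PROXIES = ["203.0.113.10:8080", "203.0.113.11:8080", "203.0.113.12:8080"]
_proxy_pool = cycle(PROXIES)


def next_proxy_argument():
    """Return Chrome's --proxy-server flag for the next proxy in the pool."""
    return f"--proxy-server=http://{next(_proxy_pool)}"


def make_driver():
    """Start Chrome through the next proxy (needs selenium and a Chrome driver)."""
    # Imported lazily so the rotation helper above works without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument(next_proxy_argument())
    return webdriver.Chrome(options=options)
```

Each call to `make_driver()` opens a fresh browser routed through the next proxy in the list, wrapping around to the first one once the list is exhausted.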

Some more information here

