Cannot access Amazon with Selenium: "Please Enable Cookies"

I am trying to purchase a product on Amazon through Selenium, but I cannot log in to my account. When I try to log in, I get the "Please Enable Cookies" message. I am using chromedriver, which should have cookies enabled. My code is below:

import logging
import time
import uuid
from pyvirtualdisplay import Display
from selenium import webdriver
from selenium.webdriver.common.keys import Keys


class BB_Scraper:
    def __init__(self, item):
        self.item = item
        self.driver = self.create_driver()
        self.pg_counter = 1
        self.name = str(uuid.uuid4())

    def create_driver(self):
        display = Display(visible=0, size=(1024, 768))
        display.start()
        chrome_options = webdriver.ChromeOptions()
        chrome_options.add_argument('--no-sandbox')
        chrome_options.add_argument('--disable-dev-shm-usage')
        chrome_options.add_argument('user-data-dir=selenium')
        driver = webdriver.Chrome(
                executable_path=r"/app/src/chromedriver_nix",
                options=chrome_options)
        print('driver created')
        return driver

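    def save_page(self):
        # Dump the current page to a uniquely named file for later inspection.
        # An explicit encoding avoids errors on non-ASCII characters in the page.
        with open(f'{self.name}_{self.pg_counter}.html', 'w', encoding='utf-8') as output:
            output.write(self.driver.page_source)
        self.pg_counter += 1
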
    def run(self):
        self.driver.get('https://www.amazon.com')
        self.save_page()
        time.sleep(5)
        search_bar = self.driver.find_element_by_id("twotabsearchtextbox")
        search_bar.send_keys(self.item)
        search_bar.send_keys(Keys.ENTER)
        time.sleep(5)
        self.save_page()
        self.driver.find_element_by_link_text(self.item).click()
        time.sleep(5)
        self.save_page()
        self.driver.find_element_by_id('buy-now-button').click()
        time.sleep(5)
        self.save_page()
        self.driver.close()


if __name__ == '__main__':
    scraper = BB_Scraper('HP Printer Paper 8.5 x 11 | 20 lb - 1 ream - 500 Sheets | 92 Bright - Made in USA | FSC Certified Copy Paper | HP Compatible 172160R')
    scraper.run()

Try setting a user agent by adding this line (replace UA with a real browser user-agent string):

chrome_options.add_argument("user-agent=UA")

Also keep in mind that Amazon has good anti-bot detection, so your IP has most likely been blocked. If you try again after 20-25 minutes it might work, but it will probably be blocked again after 2-3 minutes.
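
As a rough sketch of the user-agent suggestion above, the options from the question could be built like this. The names `EXAMPLE_USER_AGENT`, `build_chrome_arguments` and `make_chrome_options` are made up for illustration, and the user-agent string is only a placeholder value, not one known to get past Amazon's checks.

```python
# Illustrative value only: substitute the user-agent string of a real browser.
EXAMPLE_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def build_chrome_arguments(user_agent, profile_dir="selenium"):
    """Return the Chrome switches used in the question plus a user agent."""
    if not user_agent.strip():
        raise ValueError("user_agent must be a non-empty string")
    return [
        "--no-sandbox",
        "--disable-dev-shm-usage",
        f"--user-data-dir={profile_dir}",
        f"--user-agent={user_agent}",
    ]


def make_chrome_options(user_agent):
    """Build a ChromeOptions object; selenium is imported only when called."""
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in build_chrome_arguments(user_agent):
        options.add_argument(argument)
    return options
```

In `create_driver`, the result of `make_chrome_options(EXAMPLE_USER_AGENT)` would be passed as `options=` in place of the hand-built `chrome_options`. This only changes the reported user agent, so it may not help if the IP itself has been blocked.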
