Amazon is detecting Selenium as a bot

Amazon is detecting Selenium as a bot, so I changed the user agent, but the problem persists.

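I am scraping data from six Amazon marketplaces (com, mx, uk, au, ae, ca). The problem only occurs on three of them (com, mx, ca): the page data does not load, and Amazon thinks I am a bot, so no relevant data comes back from those sites. How does it know that I am using Selenium?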

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup
from fake_useragent import UserAgent
import pandas as pd

# Pick one random user agent for the whole session.
options = Options()
user_agent = UserAgent().random
#print(user_agent)
options.add_argument(f'user-agent={user_agent}')

chrome_driver_path = r'C:\chromedriver_win32\chromedriver.exe'
# Selenium 4 style: the driver path is passed through Service, and the
# deprecated chrome_options keyword is replaced by options.
driver = webdriver.Chrome(service=Service(chrome_driver_path), options=options)

urls = [
    'https://www.amazon.com/gp/offer-listing/',
    'https://www.amazon.ca/gp/offer-listing/',
    'https://www.amazon.co.uk/gp/offer-listing/',
    'https://www.amazon.ae/gp/offer-listing/',
    'https://www.amazon.com.au/gp/offer-listing/',
    'https://www.amazon.com.mx/gp/offer-listing/',
]
marketler = ['USA', 'CA', 'UK', 'AE', 'AU', 'MX']  # market labels, same order as urls
asins = ['B07BGLT25K']
OfferData = []


def offerlisting():
    for asin in asins:
        for url, market in zip(urls, marketler):
            driver.get(url + asin)

            soup = BeautifulSoup(driver.page_source, 'lxml')
            sellers = soup.find_all('div', class_='olpOffer')
            print('Satıcı Sayısı:', len(sellers))  # "Satıcı Sayısı" = seller count
            seller = soup.find_all(class_="a-spacing-none olpSellerName")
            seller2 = [o.get_text().strip().replace('\n', '') for o in seller]
            print(seller2)
            # One row per ASIN and market; seller names ("Satıcılar") are
            # joined into a string so the cell can be written to Excel.
            OfferData.append({
                "Asin": asin,
                "Market": market,
                "Satıcı Sayısı": len(sellers),
                "Satıcılar": ', '.join(seller2),
            })


def save():
    df = pd.DataFrame(OfferData, columns=['Asin', 'Market', 'Satıcı Sayısı', 'Satıcılar'])
    # The encoding argument is dropped: recent pandas versions no longer accept it in to_excel.
    df.to_excel('C:/chromedriver_win32/amazon_karsilastirma.xlsx', index=False, header=True)


offerlisting()
save()
driver.quit()
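
A seller count of 0 can mean either that the listing is empty or that Amazon served a robot-check page instead. The sketch below is one possible way to tell the two apart before parsing. The helper name `looks_blocked` and the marker strings in `BLOCK_MARKERS` are assumptions for illustration (text commonly reported on Amazon's robot-check page), not something confirmed by the question, and they may need adjusting per marketplace.

```python
# Hypothetical marker strings, assumed to appear on Amazon's robot-check page.
BLOCK_MARKERS = (
    "api-services-support@amazon.com",
    "Enter the characters you see below",
    "/errors/validateCaptcha",
)


def looks_blocked(page_source: str) -> bool:
    """Return True if the HTML contains any of the assumed robot-check markers."""
    lowered = page_source.lower()
    return any(marker.lower() in lowered for marker in BLOCK_MARKERS)
```

Inside the scraping loop this could be called as `looks_blocked(driver.page_source)` right after `driver.get(...)`, so that blocked marketplaces are logged instead of being recorded with zero sellers. Changing the user agent does not change the `navigator.webdriver` flag that WebDriver-controlled browsers expose; whether Amazon actually relies on that flag is an assumption.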

