Chrome --headless mode not working, although normal mode works fine
I am using the code below for Chrome --headless mode, but it is not executing correctly. The same code works fine in normal mode.
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def instagram_login():
    # Start Chrome in headless mode
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument("--headless")
    driver = webdriver.Chrome("/home/chromedriver", options=chrome_options)
    driver.get('https://www.instagram.com/')
    driver.maximize_window()
    driver.implicitly_wait(20)
    # Fill in the login form
    form = driver.find_element_by_xpath("//*[@class='HmktE']")
    usrinput = form.find_element_by_name("username")
    usrinput.clear()
    usrinput.send_keys("xxxxxx")
    usrpwd = form.find_element_by_name("password")
    usrpwd.clear()
    usrpwd.send_keys("xxxxx")
    time.sleep(2)
    loginbt = form.find_elements_by_tag_name('button')
    loginbt[1].click()
    time.sleep(5)
    # Dismiss the two dialogs shown after logging in
    wait = WebDriverWait(driver, 10)
    wait.until(EC.element_to_be_clickable((By.XPATH, "/html/body/div[1]/section/main/div/div/div/div/button"))).click()
    time.sleep(2)
    wait = WebDriverWait(driver, 10)
    wait.until(EC.element_to_be_clickable((By.XPATH, "//button[text()='Not Now']"))).click()
    return driver
Please find the error below:
Traceback (most recent call last):
  File "/home/Instagram/insta.py", line 539, in <module>
    (driver, postauth, hlist) = get_instalinks(x)
  File "/home//PycharmProjects(SEP)/Instagram/insta.py", line 76, in get_instalinks
    driver = instagram_login()
  File "/home/Instagram/insta_.py", line 56, in instagram_login
    wait.until(EC.element_to_be_clickable((By.XPATH, "//button[text()='Not Now']"))).click()
  File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/support/wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
The problem is with your User-Agent. Some websites check your user agent when you browse, in order to reduce the use of scrapers. If they notice anything suspicious, they will limit (or fully restrict) your activity on the page.
Compare the user agent Chrome sends in normal mode with the one it sends in headless mode:
Normal mode: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36
Headless mode: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/96.0.4664.45 Safari/537.36
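To confirm which of the two strings above your own browser is sending, you can ask the running browser directly. The following is a minimal sketch, not part of the original answer: the helper names looks_headless and report_user_agent are hypothetical, and driver is assumed to be an already started Selenium WebDriver such as the one created in the question.

```python
def looks_headless(user_agent: str) -> bool:
    """Return True if the user agent string carries Chrome's headless marker."""
    return "HeadlessChrome" in user_agent


def report_user_agent(driver) -> str:
    """Ask the running browser which user agent it reports.

    `driver` is assumed to be a Selenium WebDriver instance.
    """
    user_agent = driver.execute_script("return navigator.userAgent")
    print(user_agent, "-> headless marker present:", looks_headless(user_agent))
    return user_agent
```

Calling report_user_agent(driver) right after driver.get(...) in both modes should show whether the headless marker is present in your setup.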
Instagram recognizes the headless user agent as suspicious and restricts access. You should set the following Chrome option to evade this restriction:
chrome_options.add_argument("--user-agent=USER AGENT")
Replace "USER AGENT" above with the string shown at this link: My User Agent
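Put together, the override might look like the sketch below. It is only an illustration under a few assumptions: the helper names user_agent_argument and make_headless_driver are hypothetical, EXAMPLE_USER_AGENT is the normal-mode string quoted earlier (substitute the one your own browser reports), and the driver is created with the same positional chromedriver path as in the question, a call style that newer Selenium releases no longer accept.

```python
# Normal-mode user agent quoted in this answer; replace with your own browser's string.
EXAMPLE_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36"
)


def user_agent_argument(user_agent: str) -> str:
    """Build the Chrome command-line switch that overrides the user agent."""
    return f"--user-agent={user_agent}"


def make_headless_driver(user_agent: str = EXAMPLE_USER_AGENT,
                         driver_path: str = "/home/chromedriver"):
    """Start headless Chrome that reports `user_agent` instead of the headless default."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver

    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument(user_agent_argument(user_agent))
    # Assumption: same Selenium 3 style call as in the question's code.
    return webdriver.Chrome(driver_path, options=chrome_options)
```

In instagram_login(), the three lines that build chrome_options and driver could then be replaced by driver = make_headless_driver().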
Furthermore, as an additional safeguard, I recommend following this article on how to make your scraper as undetectable as possible when browsing in headless mode.