Python Selenium Chromedriver not working with --headless option

Question

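I am running chromedriver to try to scrape some data from a website. Without the headless option, everything works fine. However, when I add the option, the webdriver takes a very long time to load the URL, and when I try to find an element (one that is found when running without --headless), I get an error.

Using print statements and grabbing the HTML after the URL has "loaded", I found that there is no HTML; it is empty (see the output below).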

import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.ui import WebDriverWait


class Fidelity:
    def __init__(self):
        self.url = 'https://eresearch.fidelity.com/eresearch/gotoBL/fidelityTopOrders.jhtml'
        self.options = Options()
        self.options.add_argument("--headless")
        self.options.add_argument("--window-size=1500,1000")
        self.driver = webdriver.Chrome(executable_path='.\\dependencies\\chromedriver.exe', options=self.options)
        print("init")

    def initiate_browser(self):
        self.driver.get(self.url)
        time.sleep(5)
        # Dump the rendered HTML to see what headless Chrome actually received.
        script = self.driver.execute_script("return document.documentElement.outerHTML")
        print(script)
        print("got url")

    def find_orders(self):
        wait = WebDriverWait(self.driver, 15)
        data = wait.until(ec.visibility_of_element_located((By.CSS_SELECTOR, '[id*="t_trigger_TSLA"]')))  # ERROR ON THIS LINE

Here is the full output:

init
<html><head></head><body></body></html>
url
Traceback (most recent call last):
  File "C:\Users\Zachary\Documents\Python\Tesla Stock Info\Scraper.py", line 102, in <module>
    orders = scrape.find_tesla_orders()
  File "C:\Users\Zachary\Documents\Python\Tesla Stock Info\Scraper.py", line 75, in find_tesla_orders
    tesla = self.driver.find_element_by_xpath("//a[@href='https://qr.fidelity.com/embeddedquotes/redirect/research?symbol=TSLA']")
  File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 394, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 978, in find_element
    'value': value})['value']
  File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//a[@href='https://qr.fidelity.com/embeddedquotes/redirect/research?symbol=TSLA']"}
  (Session info: headless chrome=74.0.3729.169)
  (Driver info: chromedriver=74.0.3729.6 (255758eccf3d244491b8a1317aa76e1ce10d57e9-refs/branch-heads/3729@{#29}),platform=Windows NT 10.0.17763 x86_64)

New error with the updated code:

init
<html><head></head><body></body></html>
url
Traceback (most recent call last):
  File "C:\Users\Zachary\Documents\Python\Tesla Stock Info\Scraper.py", line 104, in <module>
    orders = scrape.find_tesla_orders()
  File "C:\Users\Zachary\Documents\Python\Tesla Stock Info\Scraper.py", line 76, in find_tesla_orders
    tesla = wait.until(ec.visibility_of_element_located((By.CSS_SELECTOR, '[id*="t_trigger_TSLA"]')))
  File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:

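I have tried to find an answer to this through Google, but none of the suggestions worked. Has anyone else run into this problem with certain websites? Any help is appreciated.

Update

Unfortunately, this script still does not work. For some reason the webdriver does not load the page properly when headless, even though everything works fine when it is run without the headless option.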

Answer 1

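For anyone who wants to solve this problem in the future: some websites simply do not load properly with Chrome's headless option. I don't think there is a way to fix this. Just use a different browser (such as Firefox). Thanks to user8426627 for this.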

Answer 2

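Have you tried using a user agent?

I was getting the same error. The first thing I did was save the HTML page source in both headless and normal mode: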

html = driver.page_source
# Write the page source to a file so the two modes can be compared.
with open("foo.html", "w", encoding="utf-8") as file:
    file.write(html)

The HTML source in headless mode was a short file with this line nearly at the end: The page cannot be displayed. Please contact the administrator for additional information. In normal mode, the source was the expected HTML.

I solved the problem by adding a user agent:
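The comparison described above can be roughly automated. The following is only a sketch: the function name `diagnose_page_source` and the three labels are made up for illustration, and the marker text is simply the error line quoted in this answer, so other sites will need a different marker.

```python
import re

# Assumed marker text: the error line quoted in this answer.
BLOCK_MARKER = "The page cannot be displayed"


def diagnose_page_source(html: str) -> str:
    """Classify a page_source string as 'empty', 'blocked', or 'ok'."""
    # Strip tags and surrounding whitespace to see whether any text remains.
    text = re.sub(r"<[^>]+>", "", html).strip()
    if not text:
        return "empty"
    if BLOCK_MARKER in html:
        return "blocked"
    return "ok"
```

Calling it on `driver.page_source` in both modes shows whether the headless run received an empty document, an error page, or real content.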

from selenium import webdriver

user_agent = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.2 (KHTML, like Gecko) Chrome/22.0.1216.0 Safari/537.2'
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument(f'user-agent={user_agent}')
# "your_path" is a placeholder for the chromedriver location.
driver = webdriver.Chrome(executable_path="your_path", options=chrome_options)
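A likely reason this helps: headless Chrome of that era reports a user agent containing the token `HeadlessChrome`, which some sites appear to reject. Instead of hard-coding an old version string, one option is to reuse the browser's own user agent with that token replaced. This is a sketch only; `desktop_user_agent` is a hypothetical helper name, and whether a given site checks this token is an assumption.

```python
def desktop_user_agent(headless_ua: str) -> str:
    """Return the UA string with the 'HeadlessChrome' token replaced by 'Chrome'."""
    return headless_ua.replace("HeadlessChrome", "Chrome")


# Usage sketch (needs a running driver, so it is left as comments):
#   ua = driver.execute_script("return navigator.userAgent")
#   chrome_options.add_argument(f"user-agent={desktop_user_agent(ua)}")
```

The user agent has to be set before the driver is created, so in practice this means starting a throwaway headless session once to read the string, then starting the real session with the modified value.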

Answer 3

Add an explicit wait. You should also use a different locator; the current one matches 3 elements. The element has a unique id attribute:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.common.by import By

wait = WebDriverWait(self.driver, timeout)  # timeout is in seconds, e.g. 15
data = wait.until(ec.visibility_of_element_located((By.CSS_SELECTOR, '[id*="t_trigger_TSLA"]')))
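For readers unfamiliar with explicit waits, the idea is a polling loop: check a condition repeatedly until it succeeds or a deadline passes. The sketch below illustrates that idea in plain Python. It is not Selenium's implementation, and `wait_until` is a made-up name; the 0.5 second default mirrors WebDriverWait's default polling interval.

```python
import time


def wait_until(condition, timeout=15.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)
```

This also explains the TimeoutException in the question: if the page body stays empty, no amount of waiting makes the element appear, so the wait only reports the problem rather than fixing it.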

Answer 4

I needed to run the script from the same console without switching to the Google Chrome browser, but the browser still ran alongside my program.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("window-size=1920,1080")
print("complete")

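# Pass the options to the driver; otherwise the headless flags above have no effect.
driver = webdriver.Chrome(r'C:\proyectos\python-selenium\driver\chromedriver.exe', options=chrome_options)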
driver.get('https://www.facebook.com/')

Answer 5

Try setting the window size as well as headless. Add this:

chromeOptions.add_argument("--window-size=1920,1080")

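The default window size of a headless browser is small. If the code works when headless is not enabled, it may be because your element is outside the window.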
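To test this theory, an element's rectangle can be compared with the window size. The sketch below assumes the dictionary shape returned by Selenium's `element.rect` (keys `x`, `y`, `width`, `height`); `is_inside_window` is a hypothetical helper, and it treats page coordinates as window coordinates, which only holds while the page has not been scrolled.

```python
def is_inside_window(rect, window_width, window_height):
    """Return True if the element rectangle lies fully inside the window area."""
    return (
        rect["x"] >= 0
        and rect["y"] >= 0
        and rect["x"] + rect["width"] <= window_width
        and rect["y"] + rect["height"] <= window_height
    )


# An element positioned 1200 px from the left edge does not fit in an
# 800x600 window, but it does fit in a 1920x1080 one.
rect = {"x": 1200, "y": 300, "width": 100, "height": 20}
print(is_inside_window(rect, 800, 600))
print(is_inside_window(rect, 1920, 1080))
```

With a live driver, the window dimensions could come from `driver.get_window_size()`. Note that this check only applies once the element exists in the DOM; it does not explain an entirely empty page body.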

Answer 6

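"Some websites simply fail to load properly with Chrome's headless option."

The statement above is actually wrong. I just ran into this problem, where Chrome was not detecting the elements. I was surprised when I saw @LuckyZakary's answer, because someone had built a scraper for the same website with nodeJs and did not get this error.

@AtulGumar's answer helped on Windows, but it failed on an Ubuntu server, so it was not enough. After reading this article, it turned out that what @AtulGumar's answer was missing was the --disable-gpu flag.

So this works for me on both Windows and an Ubuntu server without a GUI, with these options: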

webOptions = webdriver.ChromeOptions()
webOptions.headless = True
webOptions.add_argument("--window-size=1920,1080")
webOptions.add_argument("--disable-gpu")
driver = webdriver.Chrome(options=webOptions)
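One practical caution about flags like these: Chrome switches start with two ASCII hyphens, and text copied from web pages sometimes has them converted to a typographic dash, in which case the flag is unlikely to be recognized. A small sanity check can catch this before the options reach the driver. This is a sketch; `normalize_flag` is a hypothetical helper, not part of Selenium.

```python
DASHES = "\u2013\u2014\u2212"  # en dash, em dash, minus sign


def normalize_flag(flag: str) -> str:
    """Rewrite a Chrome switch so it starts with exactly two ASCII hyphens."""
    # Strip every leading hyphen or dash look-alike, then add the standard prefix.
    name = flag.lstrip("-" + DASHES)
    return "--" + name


for raw in ["\u2013disable-gpu", "---incognito", "--headless"]:
    print(normalize_flag(raw))
```

Only leading characters are stripped, so hyphens inside a flag name such as `disable-gpu` are left alone.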

I also installed xvfb and other packages as suggested here:

sudo apt-get -y install xorg xvfb gtk2-engines-pixbuf
sudo apt-get -y install dbus-x11 xfonts-base xfonts-100dpi xfonts-75dpi xfonts-cyrillic xfonts-scalable

And ran:

Xvfb -ac :99 -screen 0 1280x1024x16 &
export DISPLAY=:99

Answer 7

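Try adding the executable path to the Service object:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager


# create_driver is an illustrative name; the original snippet was a bare function body.
def create_driver():
    options = Options()
    options.add_argument('--incognito')
    options.add_argument('--disable-extensions')
    options.add_argument('--no-sandbox')
    options.add_argument('--disable-gpu')
    options.add_argument('--headless')
    service = Service(executable_path=ChromeDriverManager().install())
    return webdriver.Chrome(service=service, options=options)

It worked for me :)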

Answer 1: 9 votes, accepted, 2019-06-08 02:17:13
Answer 2: 2 votes, 2021-04-04 22:01:45
Answer 3: 0 votes, 2019-06-04 04:45:24
Answer 4: 0 votes, 2019-11-06 05:54:28
Answer 5: 0 votes, 2022-09-03 09:32:57
Answer 6: 0 votes, 2023-01-20 09:29:08
Answer 7: 0 votes, 2023-01-24 06:09:01
