How to automatically download a file with Python and Selenium without clicking the Save button?

Question

Is there a way to automatically download a file using Python/Selenium in headless mode?

With headless mode on, I can use autoit to click the button. But with headless mode off, autoit cannot find the Save As window:

raise AutoItError(err_msg)
autoit.autoit.AutoItError: Window/Control could not be found

Is there a way to give the driver a URL to a file and download it using just driver.get("https://.../myfile.xls")?

Answer 1

I figured it out. You have to set the Chrome download preferences ("prefs") through the Selenium driver options:

    prefs = {'download.default_directory': download_location,
             'download.prompt_for_download': False,
             'download.directory_upgrade': True,
             'safebrowsing.enabled': False,
             'safebrowsing.disable_download_protection': True}

Enable headless mode:

chrome_options.add_argument("--headless")

Now the file is saved without clicking the Save button in a dialog.
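As a rough sketch, the preferences above could be assembled in one small helper. The name build_download_prefs is hypothetical and not part of Selenium; it only builds the dictionary, which would then be passed to chrome_options.add_experimental_option('prefs', prefs). Converting the directory to an absolute path is an assumption on my side: Chrome appears to expect an absolute download directory.

```python
import os


def build_download_prefs(download_dir):
    """Return Chrome prefs that save downloads to download_dir without a prompt.

    Hypothetical helper; the keys are the ones listed in the answer above.
    """
    return {
        # Assumption: Chrome wants an absolute path here.
        "download.default_directory": os.path.abspath(download_dir),
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
        "safebrowsing.enabled": False,
        "safebrowsing.disable_download_protection": True,
    }
```

Usage would look like prefs = build_download_prefs("downloads") followed by the add_experimental_option call shown in DriverBuilder.py.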

The full solution consists of two files, DriverBuilder.py and TestDownload.py. Test:

if __name__ == "__main__":
    td = TestDownload()
    url = 'https://test.com/test.xlsx'
    td.test_download(url)

DriverBuilder.py

import os
import sys

from selenium.webdriver import Chrome
from selenium.webdriver.chrome import webdriver as chrome_webdriver


class DriverBuilder():
    def get_driver(self, download_location=None, headless=False):

        driver = self._get_chrome_driver(download_location, headless)

        driver.set_window_size(1400, 700)

        return driver

    def _get_chrome_driver(self, download_location, headless):
        chrome_options = chrome_webdriver.Options()
        if download_location:
            prefs = {'download.default_directory': download_location,
                     'download.prompt_for_download': False,
                     'download.directory_upgrade': True,
                     'safebrowsing.enabled': False,
                     'safebrowsing.disable_download_protection': True}

            chrome_options.add_experimental_option('prefs', prefs)

        if headless:
            chrome_options.add_argument("--headless")

        dir_path = os.path.dirname(os.path.realpath(__file__))
        driver_path = r"C:\Users\H30801\mygit\driver\chromedriver_83"

        if sys.platform.startswith("win"):
            driver_path += ".exe"

        # "options" replaces the deprecated "chrome_options" keyword argument
        driver = Chrome(executable_path=driver_path, options=chrome_options)

        if headless:
            self.enable_download_in_headless_chrome(driver, download_location)

        return driver

    def enable_download_in_headless_chrome(self, driver, download_dir):

        # add missing support for chrome "send_command"  to selenium webdriver
        driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')

        params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': download_dir}}
        command_result = driver.execute("send_command", params)
        print("response from browser:")
        for key in command_result:
            print("result:" + key + ":" + str(command_result[key]))
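Recent Selenium releases expose execute_cdp_cmd on the Chrome driver, which may make the manual "send_command" registration above unnecessary. A hedged sketch, assuming the driver object provides execute_cdp_cmd; the function name allow_headless_downloads is made up for illustration:

```python
import os


def allow_headless_downloads(driver, download_dir):
    """Ask Chrome, via the DevTools protocol, to save downloads to download_dir.

    Assumes `driver` is a Selenium Chrome driver exposing execute_cdp_cmd.
    """
    params = {
        "behavior": "allow",
        "downloadPath": os.path.abspath(download_dir),
    }
    # Same DevTools command that the answer sends through "send_command".
    return driver.execute_cdp_cmd("Page.setDownloadBehavior", params)
```

It would be called right after the driver is created, in place of enable_download_in_headless_chrome.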

TestDownload.py

from os import path, remove
from time import sleep
from pprint import pprint as pp
from DriverBuilder import DriverBuilder  # module name must match the file DriverBuilder.py


class TestDownload:
    def __init__(self):
        pass
    def test_download(self, url):

        driver_builder = DriverBuilder()

        download_path = r"C:\Users\H30801\mygit"

        expected_download = path.join(download_path, "test.xlsx")

        # clean downloaded file
        try:
            remove(expected_download)
        except OSError:
            pass

        assert (not path.isfile(expected_download))

        driver = driver_builder.get_driver(download_path, headless=True)


        driver.get(url)

        self.wait_until_file_exists(expected_download, 30)
        driver.quit()  # end the session and stop chromedriver, not just close the window

        assert (path.isfile(expected_download))

        print("done")

    def wait_until_file_exists(self, actual_file, wait_time_in_seconds=5):
        waits = 0
        while not path.isfile(actual_file) and waits < wait_time_in_seconds:
            print("sleeping...." + str(waits))
            pp(actual_file)
            sleep(.5)  # make sure file completes downloading
            waits += .5

if __name__ == "__main__":
    td = TestDownload()
    url = 'https://test.com/test.xlsx'
    td.test_download(url)
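The wait_until_file_exists loop in TestDownload.py returns nothing on timeout and only checks that the file exists. A possible variant, sketched with the standard library only, returns a boolean and also waits until no Chrome partial-download files remain in the directory. The function name wait_for_download is hypothetical, and treating any *.crdownload file as "still downloading" is an assumption about how Chrome names its temporary files.

```python
import glob
import os
import time


def wait_for_download(file_path, timeout=30.0, poll_interval=0.5):
    """Return True once file_path exists and no *.crdownload files remain.

    Returns False if the timeout elapses first.
    """
    directory = os.path.dirname(os.path.abspath(file_path))
    # Escape the directory so characters like "[" are not read as glob syntax.
    pattern = os.path.join(glob.escape(directory), "*.crdownload")
    deadline = time.monotonic() + timeout
    while True:
        if os.path.isfile(file_path) and not glob.glob(pattern):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

In the test it could replace the call at the end of the download step, for example assert wait_for_download(expected_download, 30).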


Answer 1: 0, 2020-06-04 14:54:04
