
How do I automatically download a file using Python and Selenium without clicking the Save button?

Is there a way to automatically download a file using Python/Selenium in headless mode?

With headless on, I can use autoit to click the button. But with headless off, autoit cannot find the SaveAs window:

raise AutoItError(err_msg)
autoit.autoit.AutoItError: Window/Control could not be found

Is there a way to provide the URL of a file and download it using just driver.get("https://.../myfile.xls")?

I figured it out. You have to set the Selenium driver prefs:

    prefs = {'download.default_directory': download_location,
             'download.prompt_for_download': False,
             'download.directory_upgrade': True,
             'safebrowsing.enabled': False,
             'safebrowsing.disable_download_protection': True}
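The prefs above can be wrapped in a small helper. This is a minimal sketch, not part of Selenium: `build_download_prefs` is a hypothetical name, and it assumes Chrome should be given an absolute path to a directory that already exists, so it normalizes the path and creates the directory first.

```python
import os


def build_download_prefs(download_location):
    """Return Chrome prefs that save downloads without showing a Save dialog.

    build_download_prefs is a hypothetical helper, not a Selenium API.
    """
    # Assumption: Chrome is given an absolute path to an existing directory.
    download_dir = os.path.abspath(download_location)
    os.makedirs(download_dir, exist_ok=True)

    return {
        'download.default_directory': download_dir,
        'download.prompt_for_download': False,  # do not open the Save dialog
        'download.directory_upgrade': True,
        'safebrowsing.enabled': False,
        'safebrowsing.disable_download_protection': True,
    }
```

The returned dict can then be passed with chrome_options.add_experimental_option('prefs', build_download_prefs(download_location)), as in DriverBuilder.py.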

Use headless mode:

chrome_options.add_argument("--headless")

Now you will be able to save the file without clicking the Save button in the dialog.

The full solution includes 2 files, DriverBuilder.py and TestDownload.py. Test:

if __name__ == "__main__":
    td = TestDownload()
    url = 'https://test.com/test.xlsx'
    td.test_download(url)

DriverBuilder.py

import os
import sys

from selenium.webdriver import Chrome
from selenium.webdriver.chrome import webdriver as chrome_webdriver


class DriverBuilder():
    def get_driver(self, download_location=None, headless=False):

        driver = self._get_chrome_driver(download_location, headless)

        driver.set_window_size(1400, 700)

        return driver

    def _get_chrome_driver(self, download_location, headless):
        chrome_options = chrome_webdriver.Options()
        if download_location:
            prefs = {'download.default_directory': download_location,
                     'download.prompt_for_download': False,
                     'download.directory_upgrade': True,
                     'safebrowsing.enabled': False,
                     'safebrowsing.disable_download_protection': True}

            chrome_options.add_experimental_option('prefs', prefs)

        if headless:
            chrome_options.add_argument("--headless")

        dir_path = os.path.dirname(os.path.realpath(__file__))
        driver_path = r"C:\Users\H30801\mygit\driver\chromedriver_83"

        if sys.platform.startswith("win"):
            driver_path += ".exe"

        # 'options' replaces the deprecated 'chrome_options' keyword
        driver = Chrome(executable_path=driver_path, options=chrome_options)

        if headless:
            self.enable_download_in_headless_chrome(driver, download_location)

        return driver

    def enable_download_in_headless_chrome(self, driver, download_dir):

        # add missing support for chrome "send_command"  to selenium webdriver
        driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')

        params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': download_dir}}
        command_result = driver.execute("send_command", params)
        print("response from browser:")
        for key in command_result:
            print("result:" + key + ":" + str(command_result[key]))
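Selenium versions that expose `execute_cdp_cmd` on the Chrome driver can send the same DevTools command without patching `_commands`. The following is a minimal sketch under that assumption; `allow_headless_downloads` is a hypothetical name, and the function only relies on the driver having an `execute_cdp_cmd` method.

```python
import os


def allow_headless_downloads(driver, download_dir):
    """Ask Chrome, via the DevTools protocol, to save downloads to download_dir.

    Assumes `driver` is a Selenium Chrome driver that provides execute_cdp_cmd.
    allow_headless_downloads is a hypothetical helper, not a Selenium API.
    """
    params = {
        'behavior': 'allow',
        # Same command as in DriverBuilder.py, with the path made absolute.
        'downloadPath': os.path.abspath(download_dir),
    }
    return driver.execute_cdp_cmd('Page.setDownloadBehavior', params)
```

It would be called in place of enable_download_in_headless_chrome, right after the driver is created and before driver.get(url).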

TestDownload.py

from os import path, remove
from time import sleep
from pprint import pprint as pp
from DriverBuilder import DriverBuilder  # module name matches the file name DriverBuilder.py


class TestDownload:
    def __init__(self):
        pass
    def test_download(self, url):

        driver_builder = DriverBuilder()

        download_path = r"C:\Users\H30801\mygit"

        expected_download = path.join(download_path, "test.xlsx")

        # clean downloaded file
        try:
            remove(expected_download)
        except OSError:
            pass

        assert (not path.isfile(expected_download))

        driver = driver_builder.get_driver(download_path, headless=True)


        driver.get(url)

        self.wait_until_file_exists(expected_download, 30)
        driver.quit()  # quit() also ends the chromedriver process

        assert (path.isfile(expected_download))

        print("done")

    def wait_until_file_exists(self, actual_file, wait_time_in_seconds=5):
        waits = 0
        while not path.isfile(actual_file) and waits < wait_time_in_seconds:
            print("sleeping...." + str(waits))
            pp(actual_file)
            sleep(.5)  # poll every half second until the file appears or the timeout elapses
            waits += .5

if __name__ == "__main__":
    td = TestDownload()
    url = 'https://test.com/test.xlsx'
    td.test_download(url)
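The wait loop in TestDownload.py only checks that the file exists and returns nothing, so the caller cannot tell a timeout from success. A possible variant is sketched below; `wait_for_download` is a hypothetical name, and it assumes Chrome keeps an in-progress download in the same directory under a name ending in `.crdownload`.

```python
import os
import time


def _has_partial_download(directory):
    """Return True if the directory holds a file ending in '.crdownload'."""
    if not os.path.isdir(directory):
        return False
    return any(name.endswith('.crdownload') for name in os.listdir(directory))


def wait_for_download(file_path, timeout=30.0, poll_interval=0.5):
    """Return True once file_path exists and no partial download remains.

    Assumption: Chrome marks unfinished downloads with a '.crdownload' suffix.
    Returns False if the timeout elapses first.
    """
    directory = os.path.dirname(file_path) or '.'
    deadline = time.monotonic() + timeout

    while time.monotonic() < deadline:
        if os.path.isfile(file_path) and not _has_partial_download(directory):
            return True
        time.sleep(poll_interval)

    # One last check after the deadline has passed.
    return os.path.isfile(file_path) and not _has_partial_download(directory)
```

In test_download this could be used as assert wait_for_download(expected_download, 30), which makes a timed-out download fail at the wait itself.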

