How do I automatically download a file using Python and Selenium without clicking the Save button?
Is there a way to automatically download a file using Python/Selenium in headless mode?
With headless mode on, I can use autoit to click the button. But with headless mode off, autoit cannot find the Save As window:
raise AutoItError(err_msg)
autoit.autoit.AutoItError: Window/Control could not be found
Is there a way to give the driver the URL of a file and download it using just driver.get("https://.../myfile.xls")?
I figured it out. You have to set these preferences on the Selenium Chrome driver:
prefs = {'download.default_directory': download_location,
         'download.prompt_for_download': False,
         'download.directory_upgrade': True,
         'safebrowsing.enabled': False,
         'safebrowsing.disable_download_protection': True}
Use headless mode:
chrome_options.add_argument("--headless")
Now the file is saved without you having to click the Save button in a dialog.
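A minimal sketch of how the prefs and the headless flag fit together on one options object, assuming a Selenium version whose Chrome options live in selenium.webdriver.chrome.options. The helper names build_download_prefs and make_chrome_options are hypothetical, not part of Selenium, and making the directory absolute is an assumption about what Chrome expects rather than something stated above.

```python
import os


def build_download_prefs(download_dir):
    """Return Chrome prefs that save downloads to download_dir without a prompt.

    Hypothetical helper: the keys are the ones listed in the answer; the
    directory is made absolute on the assumption that Chrome expects that.
    """
    return {
        "download.default_directory": os.path.abspath(download_dir),
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
        "safebrowsing.enabled": False,
        "safebrowsing.disable_download_protection": True,
    }


def make_chrome_options(download_dir, headless=True):
    """Build Chrome options carrying the download prefs (needs selenium installed)."""
    # Imported lazily so build_download_prefs can be used without selenium.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_experimental_option("prefs", build_download_prefs(download_dir))
    if headless:
        options.add_argument("--headless")
    return options
```

The options object would then be handed to the Chrome driver constructor. For headless runs the full solution additionally sends Page.setDownloadBehavior to the browser.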
The full solution consists of two files, DriverBuilder.py and TestDownload.py. Because the test imports the module driver_builder, save the first file under a name that import can resolve (for example driver_builder.py). Test:
if __name__ == "__main__":
    td = TestDownload()
    url = 'https://test.com/test.xlsx'
    td.test_download(url)
# DriverBuilder.py
import sys

from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options


class DriverBuilder():
    def get_driver(self, download_location=None, headless=False):
        driver = self._get_chrome_driver(download_location, headless)
        driver.set_window_size(1400, 700)
        return driver

    def _get_chrome_driver(self, download_location, headless):
        chrome_options = Options()
        if download_location:
            # save straight to download_location, with no Save As prompt
            prefs = {'download.default_directory': download_location,
                     'download.prompt_for_download': False,
                     'download.directory_upgrade': True,
                     'safebrowsing.enabled': False,
                     'safebrowsing.disable_download_protection': True}
            chrome_options.add_experimental_option('prefs', prefs)
        if headless:
            chrome_options.add_argument("--headless")

        driver_path = r"C:\Users\H30801\mygit\driver\chromedriver_83"
        if sys.platform.startswith("win"):
            driver_path += ".exe"
        # options= replaces the deprecated chrome_options= keyword.
        # executable_path is the Selenium 3 keyword; newer Selenium 4 releases
        # take a Service object instead.
        driver = Chrome(executable_path=driver_path, options=chrome_options)
        if headless and download_location:
            self.enable_download_in_headless_chrome(driver, download_location)
        return driver

    def enable_download_in_headless_chrome(self, driver, download_dir):
        # add missing support for chrome "send_command" to selenium webdriver
        driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
        params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': download_dir}}
        command_result = driver.execute("send_command", params)
        print("response from browser:")
        for key in command_result:
            print("result:" + key + ":" + str(command_result[key]))
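As an alternative to registering send_command by hand, Selenium's Chrome driver also exposes execute_cdp_cmd, which can send the same Page.setDownloadBehavior command. A minimal sketch, assuming a Selenium release that provides that method; the function name allow_headless_downloads is hypothetical, and the driver is only duck-typed so the snippet itself does not need a browser.

```python
import os


def allow_headless_downloads(driver, download_dir):
    """Ask Chrome, via the DevTools protocol, to save downloads into download_dir.

    Hypothetical helper: `driver` is assumed to be a Selenium Chrome driver
    (or any object with a compatible execute_cdp_cmd method).
    """
    params = {
        "behavior": "allow",
        "downloadPath": os.path.abspath(download_dir),
    }
    # Same DevTools command that the send_command workaround issues.
    return driver.execute_cdp_cmd("Page.setDownloadBehavior", params)
```

It would be called once, right after the headless driver is created and before driver.get(url).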
# TestDownload.py
from os import path, remove
from time import sleep
from pprint import pprint as pp

from driver_builder import DriverBuilder


class TestDownload:
    def test_download(self, url):
        driver_builder = DriverBuilder()
        download_path = r"C:\Users\H30801\mygit"
        expected_download = path.join(download_path, "test.xlsx")
        # remove any file left over from a previous run
        try:
            remove(expected_download)
        except OSError:
            pass
        assert (not path.isfile(expected_download))

        driver = driver_builder.get_driver(download_path, headless=True)
        try:
            driver.get(url)
            self.wait_until_file_exists(expected_download, 30)
        finally:
            driver.quit()  # end the session and the chromedriver process
        assert (path.isfile(expected_download))
        print("done")

    def wait_until_file_exists(self, actual_file, wait_time_in_seconds=5):
        # poll every half second until the file appears or the time runs out
        waits = 0
        while not path.isfile(actual_file) and waits < wait_time_in_seconds:
            print("sleeping...." + str(waits))
            pp(actual_file)
            sleep(.5)
            waits += .5


if __name__ == "__main__":
    td = TestDownload()
    url = 'https://test.com/test.xlsx'
    td.test_download(url)
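The polling loop above gives no signal when it times out and only checks that the final file name exists. A slightly stricter sketch, under the assumption that Chrome keeps unfinished downloads under a temporary .crdownload name in the same directory: it waits until the target exists and no .crdownload file remains, and reports whether that happened. The name wait_for_download and its parameters are hypothetical.

```python
import os
import time


def wait_for_download(file_path, timeout=30.0, poll_interval=0.5):
    """Return True once file_path exists and no partial download remains.

    Hypothetical helper. Assumes Chrome stores unfinished downloads as
    '*.crdownload' files next to the target. Returns False on timeout.
    """
    directory = os.path.dirname(os.path.abspath(file_path))
    deadline = time.monotonic() + timeout
    while True:
        # Any leftover partial file means a download is still in progress.
        has_partial = os.path.isdir(directory) and any(
            name.endswith(".crdownload") for name in os.listdir(directory)
        )
        if os.path.isfile(file_path) and not has_partial:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

In the test it could stand in for wait_until_file_exists, with the return value asserted directly. Note that it treats any .crdownload file in the directory as blocking, so it suits a download directory used by one download at a time.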
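Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please credit this site's URL or the original address. For any questions, contact yoyou2525@163.com.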