I am a web developer in Korea. We've recently been using this Python to implement the website crawl feature.
I'm new to Python. We looked for a lot of things for about two days, and we applied them. Current issues include:
I used the following source code.
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
chrome_driver = './browser_driver/chromedriver'
options = webdriver.ChromeOptions()
options.add_argument('--headless')
download_path = r"C:\Users\files"
timeout = 10
driver = webdriver.Chrome(executable_path=chrome_driver, chrome_options=options)
driver.command_executor._commands["send_command"] = (
"POST", '/session/$sessionId/chromium/send_command')
params = {'cmd': 'Page.setDownloadBehavior',
'params': {'behavior': 'allow', 'downloadPath': download_path}}
command_result = driver.execute("send_command", params)
driver.get("site_url")
#download new window
down_xls_btn = driver.find_element_by_id("download")
down_xls_btn.click()
driver.switch_to_window(driver.window_handles[1])
#download start
down_xls_btn = driver.find_element_by_id("download2")
down_xls_btn.click()
The browser itself shuts down as soon as the download is started during testing without headless mode. The headless mode does not download the file itself.
Annotating a DevTools source related to Page.setDownloadBehavior
removes the shutdown but does not change the download path.
I am not good at English, so I translated it into a translator. It's too hard because I'm a beginner. Please help me.
I just tested it with the Firefox web browser. Firefox, unlike Chrome, shows a download window in a new form rather than a new tab, which runs an automatic download and closes the window automatically.
There is a problem here. In fact, the download was successful even in headless mode in the Firefox. However, the driver of the previously defined driver.get() was not recognized when the new window was closed.
import os
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.firefox.options import Options
import json
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir",download_path)
fp.set_preference("browser.helperApps.neverAsk.saveToDisk","application/octet-stream, application/vnd.ms-excel")
fp.set_preference("dom.webnotifications.serviceworker.enabled",False)
fp.set_preference("dom.webnotifications.enabled",False)
timeout = 10
driver = webdriver.Firefox(executable_path=geckodriver, firefox_options=options, firefox_profile=fp)
driver.get(siteurl)
down_btn = driver.find_element_by_xpath('//*[@id="searchform"]/div/div[1]/div[6]/div/a[2]')
down_btn.click()
#down_btn Click to display a new window
#Automatic download starts in new window and closes window automatically
driver.switch_to_window(driver.window_handles[0])
#window_handles Select the main window and output the table to output an error.
print(driver.title)
Perhaps this is the same problem as the one we asked earlier. Since the download is currently successful in the Firefox, we have written code to define a new driver and proceed with postprocessing.
Has anyone solved this problem?
I came across the same issue and I managed to solve it that way:
After you switch to the other window, you should enable the download again:
def enable_download_in_headless_chrome(driver, download_path):
driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')
params = {
'cmd': 'Page.setDownloadBehavior',
'params': {'behavior': 'allow', 'downloadPath': download_path}
}
driver.execute("send_command", params)
Your code will then be:
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
chrome_driver = './browser_driver/chromedriver'
options = webdriver.ChromeOptions()
options.add_argument('--headless')
download_path = r"C:\Users\files"
timeout = 10
driver = webdriver.Chrome(executable_path=chrome_driver, chrome_options=options)
enable_download_in_headless_chrome(driver, download_path)
driver.get("site_url")
#download new window
down_xls_btn = driver.find_element_by_id("download")
down_xls_btn.click()
driver.switch_to_window(driver.window_handles[1])
enable_download_in_headless_chrome(driver, download_path) # THIS IS THE MISSING AND SUPER IMPORTANT PART
#download start
down_xls_btn = driver.find_element_by_id("download2")
down_xls_btn.click()
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.