
How to download an xlsx file from the web in Python

I am trying to download an Excel file from this website, but my code fails to download it. The page has a download button that I somehow need to click from Python. Please check my code:

 import requests
 from bs4 import BeautifulSoup as BS
 from selenium import webdriver
 from fake_useragent import UserAgent

 headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.96 Safari/537.36'}

 driver = webdriver.Chrome('chromedriver_win32\chromedriver')


 page = 'https://data.world/makeovermonday/2019w16'

 driver.get(page)


 inputElement = driver.find_element_by_id("fileactions.files.download")
 #inputElement.clear()
 #inputElement.send_keys(company)
 inputElement.submit()

The simplest way forward here would be to use the Python SDK.

Alternatively, you could use requests and download the dataset with an API call. Take a look at these endpoints:

https://apidocs.data.world/toolkit/api/api-endpoints/datasets/downloaddataset
https://apidocs.data.world/toolkit/api/api-endpoints/files/downloadfile

An example of the former:

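import requests

# Download the whole dataset as a zip archive; put your own API token in the header.
url = 'https://api.data.world/v0/download/makeovermonday/2019w16'
headers = {'Authorization': 'Bearer my-token-from-https://data.world/integrations/python'}
r = requests.get(url, headers=headers)
r.raise_for_status()  # stop on an HTTP error instead of saving an error body as the zip
with open('dataset.zip', 'wb') as f:
    f.write(r.content)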
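
Since the goal is the Excel file rather than the archive, a possible follow-up step is to pull the .xlsx out of the downloaded zip. The sketch below uses only the standard library and assumes the archive actually contains one or more .xlsx members. The helper name `extract_xlsx` and the destination folder are illustrative, not part of the data.world API.

```python
import zipfile
from pathlib import Path


def extract_xlsx(zip_path, dest_dir="."):
    """Extract every .xlsx member of zip_path into dest_dir.

    Returns the list of extracted file paths (empty if the archive
    has no .xlsx members).
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Match case-insensitively so ".XLSX" is picked up as well.
            if name.lower().endswith(".xlsx"):
                extracted.append(Path(archive.extract(name, path=dest)))
    return extracted
```

After saving dataset.zip as above, calling `extract_xlsx('dataset.zip', 'data')` would place any Excel files under a `data` folder and return their paths.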

This seemed to work OK for me:

from selenium import webdriver

driver = webdriver.Chrome('C:/chromedriver_win32/chromedriver.exe')

page = 'https://data.world/makeovermonday/2019w16'

driver.get(page)
# Scroll down the page (presumably to bring the download button into view).
driver.execute_script("window.scrollTo(0, 400)")

# The first click only opens the dropdown menu; it does not start the download.
dropdownElement = driver.find_element_by_id("fileactions.files.download")
dropdownElement.click()

# Click the download entry inside the opened menu to start the download.
downloadElement = driver.find_element_by_xpath("/html/body/div[3]/div/ul/li/a/div[2]/div")
downloadElement.click()

You expect the download to start after you click the icon button, but that click only opens a pop-up panel containing the actual download button. To start the download, you have to click that button.

First, submit() only works on forms. On the given page, the download button can't be submitted. You have to use click().

Second, after you click the first button, a pop-up menu with the download link appears. You have to click the following element to actually start the download:

driver.find_element_by_css_selector("div.open > ul > li > a").click()

