
Using requests module to download zip file from aspx site

I'm trying to download a zip file using the requests module. If I run this code, it creates a zip file on my machine, but the file contains only the HTML of an error page. If I enter the URL into a browser, it correctly downloads the zipped file.

import requests
zipurl = "https://www.dallascad.org/ViewPDFs.aspx?type=3&id=\\DCAD.ORG\WEB\WEBDATA\WEBFORMS\data%20products\DCAD2021_CURRENT.zip"
zname =  "DCAD2021_CURRENT.zip"
resp = requests.get(zipurl)
zfile = open(zname, 'wb')
zfile.write(resp.content)
zfile.close()  

TL;DR: The zipurl you provided works in the browser because the browser encodes and escapes some of its characters before sending the request. The request that works from Python is instead as follows:

import requests

params = {
    'type': '3',
    'id': '//DCAD.ORG/WEB/WEBDATA/WEBFORMS/data products/DCAD2021_CURRENT.zip',
}

response = requests.get('https://www.dallascad.org/ViewPDFs.aspx', params=params) 
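To finish the download, the response body still has to be written to disk. Here is a minimal sketch, assuming you want to guard against the symptom from the question (an HTML error page saved under a .zip name). The helper name `save_if_zip` is hypothetical and not part of requests; it uses only the standard library.

```python
import io
import zipfile
from pathlib import Path


def save_if_zip(payload: bytes, dest: str) -> bool:
    """Write payload to dest only if it parses as a zip archive.

    Returns True when the file was written, False when the payload
    is something else (for example an HTML error page).
    """
    # is_zipfile accepts a file-like object, so nothing touches disk
    # until the check has passed.
    if not zipfile.is_zipfile(io.BytesIO(payload)):
        return False
    Path(dest).write_bytes(payload)
    return True
```

With the `response` object from the request above, `save_if_zip(response.content, "DCAD2021_CURRENT.zip")` would write the file only when the server actually returned an archive.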

I determined this as follows:

Navigating to the zipurl in the browser with the inspector's network tab open, I copied the request as a curl command. I then pasted that command into https://curl.trillworks.com/ and checked whether the generated Python request would work. It did. Next I removed the headers and verified that it still worked. Finally, I compared the two URLs and saw some differences in encoding and slashes.

requests.utils.unquote(response.url)
'https://www.dallascad.org/ViewPDFs.aspx?type=3&id=//DCAD.ORG/WEB/WEBDATA/WEBFORMS/data+products/DCAD2021_CURRENT.zip'

vs.

requests.utils.unquote(zipurl)
'https://www.dallascad.org/ViewPDFs.aspx?type=3&id=\\DCAD.ORG\\WEB\\WEBDATA\\WEBFORMS\\data+products\\DCAD2021_CURRENT.zip'
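The same comparison can be made on the decoded `id` parameter alone, which makes the differences easier to see. The sketch below uses only `urllib.parse` from the standard library; the helper names `id_param` and `leading_separators` are hypothetical. The first URL is written with explicit escapes so that it equals the string Python actually built from the question's literal (in a non-raw string, `\\` is a single backslash, which matches the repr shown above).

```python
from urllib.parse import parse_qs, urlsplit


def id_param(url: str) -> str:
    """Return the decoded `id` query parameter of url."""
    return parse_qs(urlsplit(url).query)["id"][0]


def leading_separators(value: str) -> int:
    """Count the slashes or backslashes at the start of value."""
    return len(value) - len(value.lstrip("\\/"))


# The question's URL as Python stores it (single backslashes throughout).
original = (
    "https://www.dallascad.org/ViewPDFs.aspx?type=3&id="
    "\\DCAD.ORG\\WEB\\WEBDATA\\WEBFORMS\\data%20products\\DCAD2021_CURRENT.zip"
)
# The URL that worked, with forward slashes and "+" for the space.
working = (
    "https://www.dallascad.org/ViewPDFs.aspx?type=3&id="
    "//DCAD.ORG/WEB/WEBDATA/WEBFORMS/data+products/DCAD2021_CURRENT.zip"
)

for label, url in (("original", original), ("working", working)):
    value = id_param(url)
    print(label, repr(value), "leading separators:", leading_separators(value))
```

Both values decode to the same path apart from the separator character and the number of leading separators (one in the question's string, two in the working one). Whether the server is sensitive to one of those differences or to both is not something this sketch can confirm.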
