How do I download a file from a web URL in Python? The download works in a browser but not with Python's requests

Question

If I enter the URL in a browser (Firefox, Chrome, etc.), the file downloads. But when I try to download the same file (same URL) with Python's requests or urllib library, I get no response.

URL: https://www.nseindia.com/products/content/sec_bhavdata_full.csv (reference page: https://www.nseindia.com/products/content/equities/equities/eq_security.htm)

What I tried:

import requests
eqfile = requests.get('https://www.nseindia.com/products/content/sec_bhavdata_full.csv')

No response. Then I tried the following:

temp = requests.get('https://www.nseindia.com/products/content/equities/equities/eq_security.htm')

Again, no response.

What is the best way to download a file from a URL (web server) like this?

Answer 1

If I send a User-Agent header similar to the one a real web browser sends, I can download it.

import requests

headers = {'User-Agent': 'Mozilla/5.0'}
url = 'https://www.nseindia.com/products/content/sec_bhavdata_full.csv'

r = requests.get(url, headers=headers)
#print(r.content)

with open('sec_bhavdata_full.csv', 'wb') as fh:
    fh.write(r.content)

Web portals often check this header, either to block requests or to format the HTML specifically for your browser/device. But requests (and urllib.request) send "python ..." in this header.
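
To see the point above without touching the network, here is a minimal sketch using only the standard library. It reads the default User-Agent that urllib.request would send and then builds (but does not send) a request that carries a browser-like value instead. The variable names are my own, chosen for illustration.

```python
import urllib.request

# The default opener carries the headers urllib.request sends on every call.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)
default_ua = default_headers["User-agent"]  # starts with "Python-urllib/"

# Build a request with a browser-like User-Agent. Nothing is sent here.
url = "https://www.nseindia.com/products/content/sec_bhavdata_full.csv"
req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})

# Request stores header names capitalized as "User-agent".
custom_ua = req.get_header("User-agent")

print("default:", default_ua)
print("custom: ", custom_ua)
```

With requests, the equivalent default starts with "python-requests/", which is the value a server can use to tell a script apart from a browser.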

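Many portals only need 'User-Agent': 'Mozilla/5.0' to send content, but others may require a full User-Agent string or even other headers such as Referer, Accept, Accept-Encoding, Accept-Language. You can see the headers your browser sends by opening https://httpbin.org/get in it.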
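
If a bare 'Mozilla/5.0' is not enough, a fuller header set can be tried. The following is only a sketch: the header values are illustrative examples of what a browser might send, not values this site is known to require, and the helper names make_browser_session and download are hypothetical. Accept-Encoding is left at the requests default so the response can still be decoded automatically.

```python
import requests

# Assumption: illustrative browser-like values, not confirmed requirements.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) "
        "Gecko/20100101 Firefox/115.0"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}

# The reference page from the question, used here as the Referer.
REFERENCE_PAGE = (
    "https://www.nseindia.com/products/content/equities/equities/eq_security.htm"
)


def make_browser_session(referer=REFERENCE_PAGE):
    """Return a Session whose default headers resemble a browser's."""
    session = requests.Session()
    session.headers.update(BROWSER_HEADERS)
    session.headers["Referer"] = referer
    return session


def download(session, url, path, timeout=30):
    """Stream url to path; raise on HTTP errors or after timeout seconds."""
    with session.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        with open(path, "wb") as fh:
            for chunk in response.iter_content(chunk_size=8192):
                fh.write(chunk)
    return path
```

A call would look like download(make_browser_session(), 'https://www.nseindia.com/products/content/sec_bhavdata_full.csv', 'sec_bhavdata_full.csv'). The explicit timeout matters because requests has no default timeout, so a server that silently ignores the request can otherwise leave the script waiting indefinitely.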
