How to download a file from a web URL in Python? The download works in a browser but not with Python's requests

Question

The file downloads if I enter the URL in a browser (Firefox, Chrome, etc.). But when I try to download the same file from the same URL with Python's requests or urllib library, I get no response.

URL: https://www.nseindia.com/products/content/sec_bhavdata_full.csv (Reference page: https://www.nseindia.com/products/content/equities/equities/eq_security.htm)

What I tried:

import requests
eqfile = requests.get('https://www.nseindia.com/products/content/sec_bhavdata_full.csv')

I got no response. Then I tried the following:

temp = requests.get('https://www.nseindia.com/products/content/equities/equities/eq_security.htm')

Again, no response.

What would be the best way to download a file from such a URL (web server)?

Answer 1

If I send a User-Agent header similar to the one a real web browser sends, then I can download it.

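import requests

# A browser-like User-Agent; the default "python ..." value gets no response from this server.
headers = {'User-Agent': 'Mozilla/5.0'}
url = 'https://www.nseindia.com/products/content/sec_bhavdata_full.csv'

r = requests.get(url, headers=headers)
r.raise_for_status()  # stop on an HTTP error instead of saving an error page

# Write the raw response bytes to disk.
with open('sec_bhavdata_full.csv', 'wb') as fh:
    fh.write(r.content)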

Portals often check this header to block requests or to format HTML specifically for your browser or device. By default, requests (and urllib.request) send a "python ..." value in this header.
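To see this default for yourself without touching the network, here is a minimal sketch using only the standard library's urllib.request. It reads the default headers that an opener would send, then builds (but does not send) a request with a browser-like User-Agent. The example.com URL is only a placeholder.

```python
import urllib.request

# Headers that urllib.request adds to every request when you set none yourself.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)
print(default_headers)  # the User-agent value starts with "Python-urllib/"

# Build a request object with a browser-like User-Agent; nothing is sent here.
req = urllib.request.Request(
    "https://example.com/",  # placeholder URL for illustration
    headers={"User-Agent": "Mozilla/5.0"},
)
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

A server that filters on this header can tell the two apart immediately, which is why the same URL can behave differently in a browser and in a script.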

Many portals need only 'User-Agent': 'Mozilla/5.0' to send content, but others may need a full User-Agent string or even other headers such as Referer, Accept, Accept-Encoding, and Accept-Language. You can see the headers your browser sends at https://httpbin.org/get

[Figure: headers sent by a real browser]
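If 'Mozilla/5.0' alone is not enough, a fuller set of browser-like headers can be tried. The following is a sketch, not a guaranteed fix: the header values are illustrative assumptions (copy the real ones from your own browser via https://httpbin.org/get), and build_request and download are hypothetical helper names. It uses urllib.request from the standard library and streams the response to disk.

```python
import shutil
import urllib.request

# Illustrative values only; replace them with what your own browser sends.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:70.0) Gecko/20100101 Firefox/70.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Referer": "https://www.nseindia.com/products/content/equities/equities/eq_security.htm",
}


def build_request(url, headers=BROWSER_HEADERS):
    """Return a Request carrying browser-like headers (nothing is sent yet)."""
    return urllib.request.Request(url, headers=dict(headers))


def download(url, path, timeout=30):
    """Fetch url with browser-like headers and stream the body into path."""
    request = build_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response, \
            open(path, "wb") as fh:
        shutil.copyfileobj(response, fh)  # copy in chunks, not all at once
    return path
```

Calling download('https://www.nseindia.com/products/content/sec_bhavdata_full.csv', 'sec_bhavdata_full.csv') would then attempt the download from the question. Accept-Encoding is left out on purpose: urllib.request does not decompress responses automatically, so asking for gzip would leave compressed bytes in the file.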

Answer 1 (3 votes, accepted, 2019-12-06 01:41:54)
