
Issue while downloading a file in Python

I am trying to download a file using requests. I am running it on Python 3.6.5. Below is my code:

import requests

file_url = "http://codex.cs.yale.edu/avi/db-book/db4/slide-dir/ch1-2.pdf"

r = requests.get(file_url, stream=True)

with open("python.pdf", "wb") as pdf:
    for chunk in r.iter_content(chunk_size=1024):
        if chunk:
            pdf.write(chunk)

I get the following error:

ConnectionError: HTTPConnectionPool(host='codex.cs.yale.edu', port=80): Max retries exceeded with url: /avi/db-book/db4/slide-dir/ch1-2.pdf (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000001421CF5080>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed',))

I have tried a number of methods suggested for the same issue, such as increasing the timeout, but they don't help. Also, the link itself works perfectly fine.

Any idea what's wrong here?

I would suggest looking into fake user agents, e.g. https://pypi.org/project/fake-useragent/, and using proxy rotation to hit the endpoint you are trying to reach. A good example of how to achieve both is https://www.scrapehero.com/how-to-rotate-proxies-and-ip-addresses-using-python-3/

The problem was in the remote terminal. For some reason the remote terminal cannot make the connection and throws the error above. The same code worked fine on my personal machine.
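For anyone hitting the same thing: "getaddrinfo failed" in the traceback means the host name could not be resolved on the machine running the script, which is probably also why raising the timeout made no difference. Below is a minimal sketch for checking that from the affected machine, using only the standard library. The helper name can_resolve is made up for illustration, and this only tests name resolution, not whether the download itself would succeed.

```python
import socket


def can_resolve(hostname, port=80):
    """Return True if this machine can resolve hostname to an address.

    can_resolve is a hypothetical helper, not part of requests or urllib3.
    """
    try:
        # The same kind of name lookup that has to succeed before any
        # HTTP connection to the host can be opened.
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror as exc:
        # A "getaddrinfo failed" error lands here: the lookup itself failed.
        print(f"Cannot resolve {hostname!r}: {exc}")
        return False
```

Calling can_resolve("codex.cs.yale.edu") on both the remote terminal and the personal machine should show whether the two environments differ in DNS or proxy setup.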

Thanks all for your suggestions.


 