Cap download size with Python requests library
I'm crawling a bunch of web pages using Python's requests library, but occasionally the crawler stumbles upon an absolutely mammoth page, be it a PDF, a video, or some other gargantuan file. Is there a good way to limit the maximum size of file it will download?
The urlopen object has an info() method that returns all kinds of useful header information, including Content-Length (with requests, the same headers are available via the response.headers dictionary).
Occasionally this is not set correctly, but it should be in most cases and will help.