
Cap download size with Python requests library

I'm crawling a bunch of web pages using Python's requests library, but occasionally the crawler stumbles upon an absolutely mammoth page, be it a PDF, a video, or some other gargantuan file. Is there a good way to limit the maximum size of file it will download?

The response object returned by urlopen has an info() method that exposes all kinds of useful header information, including Content-Length. With requests, the same headers are available as response.headers.

Occasionally this header is missing or set incorrectly, but it should be right in most cases, so checking it before downloading will help.
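Since the header cannot always be trusted, one possible approach is to combine the Content-Length check with a running byte count while streaming the body. The following is only a minimal sketch, assuming requests is installed; the names TooLargeError, read_capped, and fetch_capped and the 5 MiB default are made up for illustration and are not part of requests itself.

```python
class TooLargeError(Exception):
    """Raised when a response exceeds the configured size cap (hypothetical name)."""


def read_capped(response, max_bytes, chunk_size=8192):
    """Return the body of a streamed response, or raise TooLargeError.

    `response` is expected to behave like a requests.Response opened with
    stream=True: it needs `.headers` and `.iter_content()`.
    """
    # First check: reject early when Content-Length is present, numeric, and too big.
    declared = response.headers.get("Content-Length")
    if declared is not None and declared.isdigit() and int(declared) > max_bytes:
        raise TooLargeError(f"declared size {declared} exceeds {max_bytes}")

    # Second check: the header may be missing or wrong, so count bytes as they arrive.
    body = bytearray()
    for chunk in response.iter_content(chunk_size=chunk_size):
        body.extend(chunk)
        if len(body) > max_bytes:
            raise TooLargeError(f"body exceeded {max_bytes} bytes")
    return bytes(body)


def fetch_capped(url, max_bytes=5 * 1024 * 1024, timeout=10):
    """Download `url`, giving up once more than `max_bytes` have been received."""
    import requests  # third-party; imported lazily so read_capped works without it

    # stream=True defers the body download until iter_content() is called.
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        return read_capped(response, max_bytes)
```

A crawler could call fetch_capped(url) inside a try/except TooLargeError block and simply skip the pages that trip the cap. Because the connection is closed when the with block exits, the rest of an oversized body is never downloaded.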
