
Downloading images from multiple websites concurrently using Python

I'm trying to download multiple images concurrently over the internet using Python, and I've looked at several options, but none of them seem satisfactory.

I've considered pyCurl, but I don't really understand the example code, and it seems to be way overkill for a task as simple as this. urlgrabber seems to be a good choice, but the documentation says that the batch download feature is still in development. I can't find anything in the documentation for urllib2.

Is there an option that actually works and is simpler to implement? Thanks.

It's not fancy, but you can use urllib.urlretrieve with a pool of threads or processes running it.
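For reference, a minimal sketch of a single urlretrieve call, using the Python 2 spelling the answer names (in Python 3 the same function lives at urllib.request.urlretrieve); the URL and filename here are placeholders:

    # Minimal single-download call; Python 2 spelling, as in the answer.
    # In Python 3 the same function is urllib.request.urlretrieve.
    import urllib

    # Hypothetical URL and destination filename, for illustration only.
    urllib.urlretrieve("http://example.com/image.jpg", "image.jpg")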

Because they're waiting on network IO, you can get multiple threads running concurrently: stick the URLs and destination filenames in a Queue.Queue, and have each thread suck them up.
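A minimal sketch of that threaded approach, assuming Python 2 (Queue.Queue, urllib.urlretrieve; in Python 3 the equivalents are queue.Queue and urllib.request.urlretrieve); the download list and thread count are illustrative only:

    # Thread-pool sketch: workers drain (url, filename) pairs from a queue.
    # Python 2 naming (Queue, urllib); Python 3 uses queue and urllib.request.
    import threading
    import urllib
    import Queue

    def worker(q):
        # Each thread pulls jobs until the queue is empty, then exits.
        while True:
            try:
                url, filename = q.get_nowait()
            except Queue.Empty:
                return
            urllib.urlretrieve(url, filename)

    # Hypothetical download list, for illustration only.
    downloads = [
        ("http://example.com/a.jpg", "a.jpg"),
        ("http://example.com/b.jpg", "b.jpg"),
    ]

    q = Queue.Queue()
    for job in downloads:
        q.put(job)

    # Four threads is arbitrary; tune to your bandwidth and server limits.
    threads = [threading.Thread(target=worker, args=(q,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()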

If you use multiprocessing, it's even easier: just create a Pool of processes and call mypool.map with the function and an iterable of arguments. There isn't a thread pool in the standard library, but you can get a third-party module if you need to avoid launching separate processes.
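A minimal sketch of the Pool version under the same Python 2 naming assumption; fetch and the download list are illustrative, not from the original answer:

    # Process-pool sketch: Pool.map runs fetch across worker processes.
    # Python 2 spelling of urlretrieve again.
    import urllib
    from multiprocessing import Pool

    def fetch(job):
        # Pool.map passes one argument per item, so bundle url and filename.
        url, filename = job
        urllib.urlretrieve(url, filename)
        return filename

    if __name__ == "__main__":  # needed where processes are spawned (e.g. Windows)
        # Hypothetical download list, for illustration only.
        downloads = [
            ("http://example.com/a.jpg", "a.jpg"),
            ("http://example.com/b.jpg", "b.jpg"),
        ]
        pool = Pool(4)              # four worker processes
        pool.map(fetch, downloads)  # blocks until every download finishes
        pool.close()
        pool.join()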
