Why is the queue of files to download empty?

Question

Below is my code, which downloads several URLs, each in its own thread. I was trying to make some changes before implementing a thread pool, but with this change the queue ends up empty and the downloads never begin.

import Queue
import urllib2
import os
import utils as _fdUtils
import signal
import sys
import time
import threading

class ThreadedFetch(threading.Thread):
    """ docstring for ThreadedFetch
    """
    def __init__(self, queue,  out_queue):
        super(ThreadedFetch, self).__init__()
        self.queueItems = queue.get()
        self.__url = self.queueItems[0]
        self.__saveTo = self.queueItems[1]
        self.outQueue = out_queue

    def run(self):
        fileName = self.__url.split('/')[-1]
        path = os.path.join(DESKTOP_PATH, fileName)
        file_size = int(_fdUtils.getUrlSizeInBytes(self.__url))
        while not STOP_REQUEST.isSet():
            urlFh = urllib2.urlopen(self.__url)
            _log.info("Download: %s" , fileName)
            with open(path, 'wb') as fh:
                file_size_dl = 0
                block_sz = 8192
                while True:
                    buffer = urlFh.read(block_sz)
                    if not buffer:
                        break

                    file_size_dl += len(buffer)
                    fh.write(buffer)
                    status = r"%10d  [%3.2f%%]" % (file_size_dl, file_size_dl * 100. / file_size)
                    status = status + chr(8)*(len(status)+1)
                    sys.stdout.write('%s\r' % status)
                    time.sleep(.05)
                    sys.stdout.flush()
                    if file_size_dl == file_size:
                        _log.info("Download Completed %s%% for file %s, saved to %s",
                                    file_size_dl * 100. / file_size, fileName, DESKTOP_PATH)

Below is the main function, which creates the threads and fills the queue.

def main(appName):

    args = _fdUtils.getParser()
    urls_saveTo = {}

    # spawn a pool of threads, and pass them queue instance
    # each url will be downloaded concurrently
    for i in range(len(args.urls)):
        t = ThreadedFetch(queue, out_queue)
        t.daemon = True
        t.start()

    try:
        for url in args.urls:
            urls_saveTo[url] = args.saveTo
        # urls_saveTo = {urls[0]: args.saveTo, urls[1]: args.saveTo, urls[2]: args.saveTo}
        # populate queue with data 
        for item, value in urls_saveTo.iteritems():
            queue.put([item, value])

        # wait on the queue until everything has been processed
        queue.join()
        print '*** Done'
    except (KeyboardInterrupt, SystemExit):
        lgr.critical('! Received keyboard interrupt, quitting threads.')

Answer 1

You create the queue and then the first thread, which immediately tries to fetch an item from the still-empty queue. The ThreadedFetch.__init__() method isn't run asynchronously; only the run() method is, once you call start() on the thread object. With queue.get() in __init__(), the main thread blocks on the empty queue before it ever reaches the code that fills it.
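
To make this concrete, here is a minimal sketch that records which thread executes each method. The `Probe` class is hypothetical and exists only for illustration; it is not part of the question's code.

```python
import threading


class Probe(threading.Thread):
    """Hypothetical helper that records which thread runs each method."""

    def __init__(self):
        super(Probe, self).__init__()
        # Executed by whoever constructs the object (the calling thread).
        self.init_thread = threading.current_thread().name
        self.run_thread = None

    def run(self):
        # Executed in the new thread, and only after start() is called.
        self.run_thread = threading.current_thread().name


probe = Probe()
probe.start()
probe.join()
print("__init__ ran in: %s" % probe.init_thread)
print("run() ran in: %s" % probe.run_thread)
```

The two names differ: `__init__()` reports the calling thread, while `run()` reports the name of the newly started thread.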

Store the queue in __init__() and move the get() into the run() method. That way you can create all the threads, and each one blocks in its own thread, giving you the chance to put items into the queue from the main thread.

class ThreadedFetch(threading.Thread):
    def __init__(self, queue, out_queue):
        super(ThreadedFetch, self).__init__()
        self.queue = queue
        self.outQueue = out_queue

    def run(self):
        url, save_to = self.queue.get()
        # ...
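
A runnable sketch of this arrangement might look as follows. It makes several assumptions: the real download is replaced by a placeholder that just forwards the item to the output queue, the `download_all` helper and the example URLs are made up for illustration, and the import is written so that it works on Python 3 (`queue`) as well as Python 2 (`Queue`). It also calls `task_done()`, because `queue.join()` only returns once every item that was put into the queue has been marked as done.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2


class ThreadedFetch(threading.Thread):
    def __init__(self, in_queue, out_queue):
        super(ThreadedFetch, self).__init__()
        self.queue = in_queue      # only store the queue here
        self.outQueue = out_queue

    def run(self):
        # Blocks in the worker thread, not in the main thread.
        url, save_to = self.queue.get()
        try:
            # Placeholder for the real download; no network access here.
            self.outQueue.put((url, save_to))
        finally:
            # Without this call, in_queue.join() would never return.
            self.queue.task_done()


def download_all(urls, save_to):
    """Hypothetical driver: start one worker per URL, then fill the queue."""
    in_queue = queue.Queue()
    out_queue = queue.Queue()
    for _ in urls:
        worker = ThreadedFetch(in_queue, out_queue)
        worker.daemon = True
        worker.start()
    for url in urls:
        in_queue.put((url, save_to))
    in_queue.join()  # wait until every item has been processed
    return sorted(out_queue.get() for _ in urls)


results = download_all(["http://example.com/a", "http://example.com/b"], "/tmp")
print(results)
```

The workers are started before the queue is filled, as in the question, but now that ordering is harmless because each `get()` waits inside its own thread.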

For this example the queue is unnecessary, by the way, as every thread gets exactly one item from the queue. You could pass that item directly to the thread when creating the thread object:

class ThreadedFetch(threading.Thread):
    def __init__(self, url, save_to, out_queue):
        super(ThreadedFetch, self).__init__()
        self.url = url
        self.save_to = save_to
        self.outQueue = out_queue

    def run(self):
        # ...

And when the ThreadedFetch class really just consists of the __init__() and run() methods, you may consider moving the body of run() into a function and starting that asynchronously.

def fetch(url, save_to, out_queue):
    # ...

# ...

def main():
    # ...

    thread = threading.Thread(target=fetch, args=(url, save_to, out_queue))
    thread.daemon = True
    thread.start()
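
Filled in, the function-based variant could look roughly like the sketch below. The body of `fetch()` is a placeholder rather than real download code, and the `urls` and `save_to` parameters of `main()` are assumptions made for illustration. Since there is no input queue left to `join()`, the sketch waits for the threads themselves instead.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2


def fetch(url, save_to, out_queue):
    # Placeholder for the real download; just report what would be fetched.
    out_queue.put((url, save_to))


def main(urls, save_to):
    out_queue = queue.Queue()
    threads = []
    for url in urls:
        thread = threading.Thread(target=fetch, args=(url, save_to, out_queue))
        thread.daemon = True
        thread.start()
        threads.append(thread)
    # Wait for every download thread to finish.
    for thread in threads:
        thread.join()
    return sorted(out_queue.get() for _ in urls)


print(main(["http://example.com/a", "http://example.com/b"], "/tmp"))
```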

Answer 1
0 Accepted 2014-07-12 10:38:04
