How to create and start any number of threads in Python 3
I'm making a script to download images from a web page, and I'm trying to make it multithreaded so it's a lot faster.
In the downloading function I had to define two parameters, because when I define only one (queue) I get this error:
TypeError: downloading() takes 1 positional argument but 21* were given
* queue has 21 links
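(Aside: that error usually means the URL set was passed directly as args, so each of its 21 elements became a separate positional argument; args expects a tuple, such as args=(queue,). A minimal illustration with placeholder URLs:)

```python
from threading import Thread

def downloading(queue):
    print("got", len(queue), "links in one argument")

urls = {"url{}".format(i) for i in range(21)}

# args=(urls) has no comma, so it is just the set itself, and each of its
# 21 elements becomes a separate positional argument -> the TypeError above.
# A one-element tuple (note the trailing comma) passes the set whole:
t = Thread(target=downloading, args=(urls,))
t.start()
t.join()
```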
Code:
import urllib.request
from threading import Thread

count = 0
queue = {"some urls", ...}  # 21 image URLs
done = set()
path = 'foldername'

def downloading(queue, name):
    for imgs in queue:
        if imgs not in done:
            done.add(imgs)
            urllib.request.urlretrieve(imgs, path + '/' + imgs.split('/')[-1])
            global count
            count += 1
            print(str(count) + ' ' + name)
            print('Done: ' + imgs.split('/')[-1])

def threads(queue):
    print('Start Downloading ...')
    th1 = Thread(target=downloading, args=(queue, "Thread 1"))
    th1.start()
    th2 = Thread(target=downloading, args=(queue, "Thread 2"))
    th2.start()
    th3 = Thread(target=downloading, args=(queue, "Thread 3"))
    th3.start()
    th4 = Thread(target=downloading, args=(queue, "Thread 4"))
    th4.start()
    th5 = Thread(target=downloading, args=(queue, "Thread 5"))
    th5.start()
    th6 = Thread(target=downloading, args=(queue, "Thread 6"))
    th6.start()
    th7 = Thread(target=downloading, args=(queue, "Thread 7"))
    th7.start()
    th8 = Thread(target=downloading, args=(queue, "Thread 8"))
    th8.start()
    th9 = Thread(target=downloading, args=(queue, "Thread 9"))
    th9.start()
    th10 = Thread(target=downloading, args=(queue, "Thread 10"))
    th10.start()
Use a simple for loop over the number of threads you want. Of course, you should save the Thread objects somewhere so you can join them at the end:
def threads(queue):
    num_of_threads = 10
    print('Start Downloading ...')
    threads = []
    for i in range(1, num_of_threads + 1):
        th = Thread(target=downloading, args=(queue, "Thread {}".format(i)))
        threads.append(th)
        th.start()
    return threads
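For example, the caller can wait for all of the returned threads, and a threading.Lock keeps the shared counter safe while several threads increment it at once (a minimal sketch with a toy downloading function; the names are illustrative):

```python
from threading import Thread, Lock

count = 0
lock = Lock()

def downloading(queue, name):
    global count
    for img in queue:
        with lock:  # protect the shared counter from concurrent increments
            count += 1

def threads(queue, num_of_threads=4):
    started = []
    for i in range(1, num_of_threads + 1):
        th = Thread(target=downloading, args=(queue, "Thread {}".format(i)))
        started.append(th)
        th.start()
    return started

workers = threads(["a.jpg", "b.jpg", "c.jpg"])
for th in workers:
    th.join()  # block until every worker has finished
print("all done, count =", count)  # 4 threads x 3 images = 12 increments
```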
As suggested, a pool will be helpful in this use case, since it sizes the number of workers to your system:
from multiprocessing.dummy import Pool as ThreadPool

def downloading(img):
    urllib.request.urlretrieve(img, path + '/' + img.split('/')[-1])
    global count
    count += 1
    print('Done: ' + img.split('/')[-1])

def threads(queue):
    pool = ThreadPool()
    pool.map(downloading, queue)
    pool.close()
    pool.join()
Note that with this approach you should change the downloading function to receive a single argument: one image. The map function sends each item of an iterable (the second argument) to a function (the first argument). This is also why the done set is no longer necessary, since each image is processed exactly once.
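To see map's behavior in isolation (with a toy function, not the real downloader):

```python
from multiprocessing.dummy import Pool as ThreadPool  # thread-based Pool

def square(x):
    return x * x

pool = ThreadPool(4)  # 4 worker threads
# Each item of the iterable is sent to square; results keep input order.
results = pool.map(square, [1, 2, 3, 4, 5])
pool.close()
pool.join()
print(results)  # [1, 4, 9, 16, 25]
```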