How to create and start any number of threads in Python 3
I'm making a script to download images from a web page, and I'm trying to make it multithreaded so it's a lot faster.
In the downloading function I had to define two parameters, because when I define only one (queue) I get this error:
TypeError: downloading() takes 1 positional argument but 21* were given
* queue has 21 links
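(Aside: that error usually means the URL set was passed directly as args, so each of its 21 elements became a separate positional argument; args expects a tuple, such as args=(queue,). A minimal illustration with placeholder URLs:)

```python
from threading import Thread

def downloading(queue):
    print("got", len(queue), "links in one argument")

urls = {"url{}".format(i) for i in range(21)}

# args=(urls) has no comma, so it is just the set itself, and each of its
# 21 elements becomes a separate positional argument -> the TypeError above.
# A one-element tuple (note the trailing comma) passes the set whole:
t = Thread(target=downloading, args=(urls,))
t.start()
t.join()
```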
Code:
import urllib.request
from threading import Thread

count = 0
queue = {"some urls", ...}  # 21 image URLs
done = set()
path = 'foldername'

def downloading(queue, name):
    for imgs in queue:
        if imgs not in done:
            done.add(imgs)
            urllib.request.urlretrieve(imgs, path + '/' + imgs.split('/')[-1])
            global count
            count += 1
            print(str(count) + ' ' + name)
            print('Done: ' + imgs.split('/')[-1])

def threads(queue):
    print('Start Downloading ...')
    th1 = Thread(target=downloading, args=(queue, "Thread 1"))
    th1.start()
    th2 = Thread(target=downloading, args=(queue, "Thread 2"))
    th2.start()
    th3 = Thread(target=downloading, args=(queue, "Thread 3"))
    th3.start()
    th4 = Thread(target=downloading, args=(queue, "Thread 4"))
    th4.start()
    th5 = Thread(target=downloading, args=(queue, "Thread 5"))
    th5.start()
    th6 = Thread(target=downloading, args=(queue, "Thread 6"))
    th6.start()
    th7 = Thread(target=downloading, args=(queue, "Thread 7"))
    th7.start()
    th8 = Thread(target=downloading, args=(queue, "Thread 8"))
    th8.start()
    th9 = Thread(target=downloading, args=(queue, "Thread 9"))
    th9.start()
    th10 = Thread(target=downloading, args=(queue, "Thread 10"))
    th10.start()
Use a simple for loop over the number of threads you want. Of course, you should save the Thread objects somewhere so you can join them at the end:
def threads(queue):
    num_of_threads = 10
    print('Start Downloading ...')
    threads = []
    for i in range(1, num_of_threads + 1):
        th = Thread(target=downloading, args=(queue, "Thread {}".format(i)))
        threads.append(th)
        th.start()
    return threads
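For example, the caller can wait for all of the returned threads, and a threading.Lock keeps the shared counter safe while several threads increment it at once (a minimal sketch with a toy downloading function; the names are illustrative):

```python
from threading import Thread, Lock

count = 0
lock = Lock()

def downloading(queue, name):
    global count
    for img in queue:
        with lock:  # protect the shared counter from concurrent increments
            count += 1

def threads(queue, num_of_threads=4):
    started = []
    for i in range(1, num_of_threads + 1):
        th = Thread(target=downloading, args=(queue, "Thread {}".format(i)))
        started.append(th)
        th.start()
    return started

workers = threads(["a.jpg", "b.jpg", "c.jpg"])
for th in workers:
    th.join()  # block until every worker has finished
print("all done, count =", count)  # 4 threads x 3 images = 12 increments
```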
As suggested, a pool will be helpful in this use case, since it sizes the number of workers to your system:
from multiprocessing.dummy import Pool as ThreadPool

def downloading(img):
    urllib.request.urlretrieve(img, path + '/' + img.split('/')[-1])
    global count
    count += 1
    print('Done: ' + img.split('/')[-1])

def threads(queue):
    pool = ThreadPool()
    pool.map(downloading, queue)
    pool.close()
    pool.join()
Note that with this approach you should change the downloading function to receive a single argument: one image. The map function sends each item of an iterable (the second argument) to a function (the first argument). This is also why the done set is no longer necessary, since each image is processed exactly once.
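To see map's behavior in isolation (with a toy function, not the real downloader):

```python
from multiprocessing.dummy import Pool as ThreadPool  # thread-based Pool

def square(x):
    return x * x

pool = ThreadPool(4)  # 4 worker threads
# Each item of the iterable is sent to square; results keep input order.
results = pool.map(square, [1, 2, 3, 4, 5])
pool.close()
pool.join()
print(results)  # [1, 4, 9, 16, 25]
```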