Python: cannot create more than ~800 threads
Below is my code; I'm really new to Python. The code creates a lot of threads (more than 1000), but at some point, at around 800 threads, I get the error "error: can't start new thread". I've read a bit about thread pools, but I don't really understand them. How can I implement a thread pool in my code? Or at least, please explain it to me in simple terms.
#!/usr/bin/python
import threading
import urllib

lock = threading.Lock()

def get_wip_info(query_str):
    try:
        temp = urllib.urlopen(query_str).read()
    except:
        temp = 'ERROR'
    return temp

def makeURLcall(arg1, arg2, arg3, file_output, dowhat, result):
    url1 = "some URL call with args"
    url2 = "some URL call with args"
    if dowhat == "IN":
        result = get_wip_info(url1)
    elif dowhat == "OUT":
        result = get_wip_info(url2)
    lock.acquire()
    report = open(file_output, "a")
    report.writelines("%s - %s\n" % (arg1, result))
    report.close()
    lock.release()
    return

testername = "arg1"
stationcode = "arg2"
dowhat = "OUT"
result = "PASS"
file_source = "sourcefile.txt"
file_output = "resultfile.txt"

readfile = open(file_source, "r")
Data = readfile.readlines()

threads = []
for SNs in Data:
    SNs = SNs.strip()
    print SNs
    thread = threading.Thread(target=makeURLcall,
                              args=(SNs, stationcode, testername,
                                    file_output, dowhat, result))
    thread.start()
    threads.append(thread)

for thread in threads:
    thread.join()
Don't implement your own thread pool; use the one that ships with Python. On Python 3, you can use concurrent.futures.ThreadPoolExecutor directly. On Python 2.6 and higher, you can import Pool from multiprocessing.dummy, which mirrors the multiprocessing API but is backed by threads instead of processes. Of course, if you need to do CPU-bound work in CPython (the reference interpreter), you need real multiprocessing, not multiprocessing.dummy; Python threads are fine for I/O-bound work, but the GIL makes them quite bad for CPU-bound work.
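For reference, on Python 3 the same bounded-worker pattern with concurrent.futures.ThreadPoolExecutor looks roughly like this (a minimal sketch with a toy `fetch` function standing in for the real URL call):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(n):
    # Stand-in for the real urlopen call; just echoes its input.
    return "result-%d" % n

# max_workers caps the number of threads, so you never hit the OS limit
# no matter how many tasks you submit.
with ThreadPoolExecutor(max_workers=32) as pool:
    results = list(pool.map(fetch, range(100)))

print(results[0])  # -> result-0
```

`Executor.map` returns results in submission order, so the output lines stay aligned with the inputs even though they were fetched concurrently.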
The code below replaces your explicit use of Thread with a Pool from multiprocessing.dummy, using a fixed number of worker threads, each of which pulls another task as soon as it finishes the previous one, instead of an unbounded number of threads each doing a single task. First, since local I/O is probably fairly cheap and you want to serialize the output, we'll have the worker task return the result data instead of writing it out itself, and let the main thread do the writes to local disk (no more need for the lock, or for reopening the file over and over). Change makeURLcall to:
# Accept args as a single sequence to ease use of imap_unordered,
# and unpack on the first line
def makeURLcall(args):
    arg1, arg2, arg3, dowhat, result = args
    url1 = "some URL call with args"
    url2 = "some URL call with args"
    if dowhat == "IN":
        result = get_wip_info(url1)
    elif dowhat == "OUT":
        result = get_wip_info(url2)
    return "%s - %s\n" % (arg1, result)
Now, the code that replaces your explicit use of threads:
import multiprocessing.dummy as mp
from contextlib import closing

# Open input and output files and create the pool
# Odds are that 32 is enough workers to saturate the connection,
# but you can play around; somewhere between 16 and 128 is likely to be
# the sweet spot for network I/O
with open(file_source) as inf,\
     open(file_output, 'w') as outf,\
     closing(mp.Pool(32)) as pool:
    # Define a generator that creates tuples of arguments to pass to
    # makeURLcall. We also read the file lazily instead of using
    # readlines, to start producing results faster
    tasks = ((SNs.strip(), stationcode, testername, dowhat, result)
             for SNs in inf)
    # Pull and write results from the workers as they become available
    outf.writelines(pool.imap_unordered(makeURLcall, tasks))
# Once we leave the with block, input and output files are closed, and
# the pool workers are cleaned up