Python Thread: can't start new thread
I'm trying to run this code:
def VideoHandler(id):
    try:
        cursor = conn.cursor()
        print "Doing {0}".format(id)
        data = urllib2.urlopen("http://myblogfms2.fxp.co.il/video" + str(id) + "/").read()
        title = re.search("<span class=\"style5\"><strong>([\\s\\S]+?)</strong></span>", data).group(1)
        picture = re.search("#4F9EFF;\"><img src=\"(.+?)\" width=\"120\" height=\"90\"", data).group(1)
        link = re.search("flashvars=\"([\\s\\S]+?)\" width=\"612\"", data).group(1)
        id = id
        print "Done with {0}".format(id)
        cursor.execute("insert into videos (`title`, `picture`, `link`, `vid_id`) values('{0}', '{1}', '{2}', {3})".format(title, picture, link, id))
        print "Added {0} to the database".format(id)
    except:
        pass

x = 1
while True:
    if x != 945719:
        currentX = x
        thread.start_new_thread(VideoHandler, (currentX))
    else:
        break
    x += 1
and it says "can't start new thread"
The real reason for the error is most likely that you create far too many threads (more than 100k!) and hit an OS-level limit.
Your code can be improved in many ways besides this:
- Don't use the low-level thread module; use the Thread class in the threading module.
- Limit the number of threads you create to something reasonable: to process all elements, create a small number of threads and let each one process a subset of the whole data (this is what I propose below, but you could also adopt a producer-consumer pattern with worker threads getting their data from a queue.Queue instance).
- And never, ever have an except: pass statement in your code. Or if you do, don't come crying here if your code does not work and you cannot figure out why. :-)

Here's a proposal:
from threading import Thread
import re
import traceback
import urllib2

def VideoHandler(id_list):
    # Each thread handles a whole slice of ids instead of a single one.
    for id in id_list:
        try:
            cursor = conn.cursor()
            print "Doing {0}".format(id)
            data = urllib2.urlopen("http://myblogfms2.fxp.co.il/video" + str(id) + "/").read()
            title = re.search("<span class=\"style5\"><strong>([\\s\\S]+?)</strong></span>", data).group(1)
            picture = re.search("#4F9EFF;\"><img src=\"(.+?)\" width=\"120\" height=\"90\"", data).group(1)
            link = re.search("flashvars=\"([\\s\\S]+?)\" width=\"612\"", data).group(1)
            print "Done with {0}".format(id)
            cursor.execute("insert into videos (`title`, `picture`, `link`, `vid_id`) values('{0}', '{1}', '{2}', {3})".format(title, picture, link, id))
            print "Added {0} to the database".format(id)
        except Exception:
            # Report the failure instead of silently swallowing it.
            traceback.print_exc()

conn = get_some_dbapi_connection()  # placeholder: open your database connection here

threads = []
nb_threads = 8
max_id = 945718
for i in range(nb_threads):
    # Slices start at id 1 and do not overlap: thread i gets
    # ids i*max_id//nb_threads + 1 .. (i+1)*max_id//nb_threads.
    id_range = range(i*max_id//nb_threads + 1, (i+1)*max_id//nb_threads + 1)
    thread = Thread(target=VideoHandler, args=(id_range,))
    threads.append(thread)
    thread.start()

for thread in threads:
    thread.join()  # wait for completion
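
The producer-consumer alternative mentioned in the list above could look roughly like this. It is only a sketch under a few assumptions: it is written for Python 3 (where the module is named queue; in Python 2 it is Queue), and run_workers, process_id and _SENTINEL are hypothetical names, with process_id standing in for the download-and-insert work.

```python
import queue
import threading
import traceback

_SENTINEL = object()  # hypothetical marker telling a worker to stop


def run_workers(ids, process_id, nb_threads=8):
    """Process every id with a fixed number of worker threads.

    process_id is a hypothetical callable standing in for the real
    download/parse/insert work; its return values are collected.
    """
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is _SENTINEL:
                break  # no more work for this thread
            try:
                value = process_id(item)
                with results_lock:
                    results.append(value)
            except Exception:
                # Log the failure and keep the worker alive.
                traceback.print_exc()

    threads = [threading.Thread(target=worker) for _ in range(nb_threads)]
    for t in threads:
        t.start()
    for item in ids:
        tasks.put(item)
    for _ in threads:
        tasks.put(_SENTINEL)  # one stop marker per worker
    for t in threads:
        t.join()
    return results
```

The number of threads stays at nb_threads no matter how many ids are queued, which is what keeps the program under the OS limit. Results arrive in completion order, not input order.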
The OS limits the number of threads a process can create, so you cannot go past that limit. A thread pool would be a good choice for this kind of high-concurrency work.
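
As an illustration of the thread-pool idea, here is a minimal sketch using concurrent.futures.ThreadPoolExecutor from the Python 3 standard library (the code in the question is Python 2, so this assumes a newer interpreter). handle_video and process_all are hypothetical names, and handle_video is only a placeholder for the real download-and-insert work.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_video(video_id):
    # Hypothetical placeholder for the real download/parse/insert work.
    return "video {0}".format(video_id)


def process_all(ids, max_workers=8):
    """Run handle_video over ids with at most max_workers threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the input ids.
        return list(pool.map(handle_video, ids))
```

Calling process_all(range(1, 945719)) would reuse the same 8 threads for every id instead of starting one thread per id.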