How can I use thread-local variables with ThreadPoolExecutor?
I want each thread to have its own local variables. With threading.Thread, this can be done elegantly like this:
import struct
import threading

class TTT(threading.Thread):
    def __init__(self, lines, ip, port):
        threading.Thread.__init__(self)
        self._lines = lines
        self._sock = initsock(ip, port)  # initsock: helper defined elsewhere
        self._sts = 0
        self._cts = 0

    def run(self):
        for line in self._lines:
            query = genquery(line)  # genquery: helper defined elsewhere, assumed to return bytes
            length = len(query)
            head = 0xFFFFFFFE  # must be an int to match the 'I' format
            q = struct.pack('II%ds' % length, head, length, query)
            self._sock.send(q)
            self._sock.recv(4)  # discard the first 4 bytes of the reply
            length, = struct.unpack('I', self._sock.recv(4))
            result = b''
            remain = length
            while remain:
                t = self._sock.recv(remain)
                result += t
                remain -= len(t)
            print(result)
As you can see, the variables _lines, _sock, _sts and _cts are independent in every thread.
But with concurrent.futures.ThreadPoolExecutor, it does not seem that easy. With ThreadPoolExecutor, how can I do this elegantly (without resorting to global variables)?
New edit
class Processor(object):
    def __init__(self, host, port):
        self._sock = self._init_sock(host, port)

    def __call__(self, address, adcode):
        self._send_data(address, adcode)
        result = self._recv_data()
        return json.loads(result)


def main():
    args = parse_args()
    adcode = {"shenzhen": 440300}[args.city]
    if args.output:
        fo = open(args.output, "w", encoding="utf-8")
    else:
        fo = sys.stdout
    with open(args.file, encoding=args.encoding) as fi, fo,\
            ThreadPoolExecutor(max_workers=args.processes) as executor:
        reader = csv.DictReader(fi)
        writer = csv.DictWriter(fo, reader.fieldnames + ["crfterm"])
        test_set = AddressIter(args.file, args.field, args.encoding)
        func = Processor(args.host, args.port)
        futures = map(lambda x: executor.submit(func, x, adcode), test_set)
        for row, future in zip(reader, as_completed(futures)):
            result = future.result()
            row["crfterm"] = join_segs_tags(result["segs"], result["tags"])
            writer.writerow(row)
Using a layout very similar to what you have now would be the easiest thing. Instead of a Thread, have a normal object, and instead of run, implement your logic in __call__:
class TTT:
    def __init__(self, lines, ip, port):
        self._lines = lines
        self._sock = initsock(ip, port)
        self._sts = 0
        self._cts = 0

    def __call__(self):
        ...
        # do stuff to self
Adding a __call__ method to a class makes it possible to invoke instances as if they were regular functions. In fact, normal functions are objects with such a method. You can now pass a bunch of TTT instances to either map or submit.
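As a rough illustration of the idea, here is a minimal sketch that submits callable instances to an executor. The Counter class and its _count attribute are hypothetical stand-ins for TTT, with the socket work replaced by simple counting so the snippet runs on its own:

```python
from concurrent.futures import ThreadPoolExecutor


class Counter:
    """Hypothetical stand-in for TTT: each instance owns its own state."""

    def __init__(self, lines):
        self._lines = lines
        self._count = 0  # per-instance state, not shared with other tasks

    def __call__(self):
        # Runs in a worker thread, but only touches this instance's state
        for line in self._lines:
            self._count += len(line)
        return self._count


tasks = [Counter(["ab", "cde"]), Counter(["f"]), Counter([])]

with ThreadPoolExecutor(max_workers=2) as executor:
    # submit() accepts any callable, so each instance becomes one task
    futures = [executor.submit(task) for task in tasks]
    results = [future.result() for future in futures]

print(results)  # [5, 1, 0]
```

Because every task is a separate object, the state lives on the instance rather than on the worker thread, so it does not matter which pool thread ends up running which task.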
Alternatively, you could absorb the initialization into your task function:
def ttt(lines, ip, port):
    sock = initsock(ip, port)
    sts = cts = 0
    ...
Now you can call submit with the correct parameter list, or map with an iterable of values for each parameter.
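To make the two calling conventions concrete, here is a small sketch. The body of ttt is a hypothetical placeholder (it only counts lines and characters instead of opening a socket), and the addresses and ports are made-up values:

```python
from concurrent.futures import ThreadPoolExecutor


def ttt(lines, ip, port):
    # Hypothetical task body: everything assigned here is local to this call
    sts = cts = 0
    for line in lines:
        sts += 1
        cts += len(line)
    return f"{ip}:{port}", sts, cts


with ThreadPoolExecutor(max_workers=2) as executor:
    # submit(): pass the task's arguments after the function itself
    single = executor.submit(ttt, ["a", "bc"], "127.0.0.1", 9000).result()

    # map(): one iterable per parameter, consumed in parallel like zip()
    batches = [["a"], ["bb", "ccc"]]
    ips = ["127.0.0.1", "127.0.0.1"]
    ports = [9000, 9001]
    mapped = list(executor.map(ttt, batches, ips, ports))
```

Note that executor.map returns results in the order of the inputs, regardless of which task finishes first.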
I would prefer the former approach for this example because it opens the port outside the executor. Error reporting in executor tasks can be tricky sometimes, and I would prefer to make the error-prone operation of opening a port as transparent as possible.
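One reason errors inside tasks are easy to miss: an exception raised in a submitted task is stored on its future and only re-raised when you ask for the result. A minimal sketch, where open_port is a hypothetical function that simulates a failing connection:

```python
from concurrent.futures import ThreadPoolExecutor


def open_port(port):
    # Hypothetical stand-in for a socket setup that fails
    raise ConnectionRefusedError(f"cannot connect to port {port}")


with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(open_port, 9000)

# Nothing has propagated so far; the exception is stored on the future
error = future.exception()

try:
    future.result()  # re-raises the stored exception in the calling thread
    raised = False
except ConnectionRefusedError:
    raised = True
```

If the code never calls result() or exception(), the failure goes unnoticed, whereas opening the port in the constructor raises immediately in the main thread.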
EDIT
Based on your related question, I believe that the real question you are asking is about function-local variables (which are automatically thread-local as well) not being shared between function calls on the same thread. However, you can always pass references between function calls.
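A small sketch of what passing a reference can look like. The process function and the state dictionary are hypothetical names, and the pool is limited to one worker on purpose so that every call runs on the same thread:

```python
from concurrent.futures import ThreadPoolExecutor


def process(lines, state):
    local_count = len(lines)       # function-local: starts fresh on every call
    state["total"] += local_count  # survives across calls via the shared reference
    return local_count


state = {"total": 0}

# A single worker means all calls run one after another on the same thread
with ThreadPoolExecutor(max_workers=1) as executor:
    counts = list(executor.map(process, [["a"], ["b", "c"]], [state, state]))

print(counts, state["total"])  # [1, 2] 3
```

With more than one worker, the shared object would be touched by several threads at once, so the update would likely need to be guarded, for example with a threading.Lock.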