
How can I use a thread-local variable with ThreadPoolExecutor?

I want each thread to have its own local variables. With threading.Thread this can be done elegantly, like this:

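import struct
import threading

class TTT(threading.Thread):
    def __init__(self, lines, ip, port):
        super().__init__()
        self._lines = lines
        self._sock = initsock(ip, port)  # initsock: helper defined elsewhere
        self._sts = 0
        self._cts = 0

    def run(self):
        for line in self._lines:
            query = genquery(line)  # genquery: helper defined elsewhere, assumed to return bytes
            length = len(query)
            head = 0xFFFFFFFE  # an integer, since it is packed with the 'I' format
            q = struct.pack('II%ds' % length, head, length, query)
            self._sock.send(q)
            self._sock.recv(4)  # discard the first 4 bytes of the reply
            length, = struct.unpack('I', self._sock.recv(4))
            result = b''
            remain = length
            while remain:
                t = self._sock.recv(remain)
                if not t:  # connection closed before the full reply arrived
                    break
                result += t
                remain -= len(t)
            print(result)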

As you can see, the variables _lines, _sock, _sts and _cts are independent in every thread.

But with concurrent.futures.ThreadPoolExecutor it does not seem to be that easy. How can I do this elegantly with ThreadPoolExecutor (without more global variables)?


Edit:

class Processor(object):
    def __init__(self, host, port):
        self._sock = self._init_sock(host, port)

    def __call__(self, address, adcode):
        self._send_data(address, adcode)
        result = self._recv_data()
        return json.loads(result)

def main():
    args = parse_args()
    adcode = {"shenzhen": 440300}[args.city]

    if args.output:
        fo = open(args.output, "w", encoding="utf-8")
    else:
        fo = sys.stdout
    with open(args.file, encoding=args.encoding) as fi, fo,\
        ThreadPoolExecutor(max_workers=args.processes) as executor:
        reader = csv.DictReader(fi)
        writer = csv.DictWriter(fo, reader.fieldnames + ["crfterm"])
        test_set = AddressIter(args.file, args.field, args.encoding)
        func = Processor(args.host, args.port)
        futures = map(lambda x: executor.submit(func, x, adcode), test_set)
        for row, future in zip(reader, as_completed(futures)):
            result = future.result()
            row["crfterm"] = join_segs_tags(result["segs"], result["tags"])
            writer.writerow(row)

Using a layout very similar to what you have now would be the easiest thing. Instead of a Thread, have a normal object, and instead of run, implement your logic in __call__:

class TTT:
    def __init__(self, lines, ip, port):
        self._lines = lines
        self._sock = initsock(ip, port)
        self._sts = 0
        self._cts = 0

    def __call__(self):
        ...
        # do stuff to self

Adding a __call__ method to a class makes it possible to invoke instances as if they were regular functions. In fact, normal functions are objects with such a method. You can now hand a bunch of TTT instances to the executor through either map or submit.
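As a rough illustration of the callable-object approach, here is a minimal sketch. LineCounter, run_with_submit and run_with_map are hypothetical names that are not part of the question; LineCounter stands in for TTT and counts characters instead of talking to a socket, so the example can run on its own.

```python
from concurrent.futures import ThreadPoolExecutor


class LineCounter:
    """Hypothetical stand-in for TTT: every instance owns its own state."""

    def __init__(self, lines):
        self._lines = lines
        self._chars = 0  # lives on the instance, so tasks never share it

    def __call__(self):
        for line in self._lines:
            self._chars += len(line)
        return self._chars


def run_with_submit(chunks, max_workers=4):
    tasks = [LineCounter(chunk) for chunk in chunks]
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Each instance is itself the callable that gets submitted.
        futures = [executor.submit(task) for task in tasks]
        # Collect results in submission order.
        return [future.result() for future in futures]


def run_with_map(chunks, max_workers=4):
    tasks = [LineCounter(chunk) for chunk in chunks]
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() applies one callable to every item, so invoke each task explicitly.
        return list(executor.map(lambda task: task(), tasks))
```

Because each task is a separate instance, its attributes behave like per-task state no matter which pool thread ends up running it.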

Alternatively, you could absorb the initialization into your task function:

def ttt(lines, ip, port):
    sock = initsock(ip, port)
    sts = cts = 0
    ...

Now you can call submit with the correct parameter list, or map with an iterable of values for each parameter.
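The two call styles might look roughly like this. It is only a sketch: the body of ttt is a placeholder that builds an endpoint string where the real code would call initsock, and demo, the addresses and the ports are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def ttt(lines, ip, port):
    # Placeholder for sock = initsock(ip, port); no real connection is opened.
    endpoint = f"{ip}:{port}"
    sts = cts = 0  # function-local, so private to this call
    for line in lines:
        sts += 1
        cts += len(line)
    return endpoint, sts, cts


def demo():
    batches = [["a", "bb"], ["ccc"]]
    ips = ["10.0.0.1", "10.0.0.2"]
    ports = [9000, 9001]
    with ThreadPoolExecutor(max_workers=2) as executor:
        # submit: one call, arguments passed positionally after the function.
        single = executor.submit(ttt, batches[0], ips[0], ports[0]).result()
        # map: one iterable per parameter, consumed in lockstep like zip().
        mapped = list(executor.map(ttt, batches, ips, ports))
    return single, mapped
```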

I would prefer the former approach for this example because it opens the port outside the executor. Error reporting in executor tasks can be tricky sometimes, and I would prefer to make the error-prone operation of opening a port as transparent as possible.
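To see why error reporting inside a task can be surprising, consider this small sketch. failing_task and collect_error are hypothetical names; the point is that an exception raised in a task is stored on its future and only surfaces when you ask for it.

```python
from concurrent.futures import ThreadPoolExecutor


def failing_task():
    # Stands in for a task whose connection setup fails.
    raise ConnectionError("could not open port")


def collect_error():
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(failing_task)
    # Leaving the with-block waits for the task, yet nothing has been raised here.
    # The exception only becomes visible through exception() or result().
    err = future.exception()
    return type(err).__name__, str(err)
```

If the caller never inspects the future, the failure goes unnoticed, which is one reason to open the port before handing work to the pool.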

EDIT

Based on your related question, I believe what you are really asking about is function-local variables (which are automatically thread-local as well) not being shared between function calls on the same thread. However, you can always pass references between function calls.
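If state really does need to persist across calls on the same pool thread, one common pattern (not taken from the answer above) is threading.local combined with the initializer and initargs parameters that ThreadPoolExecutor accepts since Python 3.7. The sketch below makes two assumptions: a plain dict stands in for the socket, and init_worker, process and run are hypothetical names. Note that it relies on one module-level threading.local object.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()  # each thread sees its own attributes on this object


def init_worker(host, port):
    # Runs once in every worker thread before it handles any task.
    # A dict stands in for the per-thread socket here.
    _local.conn = {"endpoint": f"{host}:{port}", "calls": 0}


def process(item):
    conn = _local.conn  # the connection owned by the current thread
    conn["calls"] += 1
    return f"{conn['endpoint']} <- {item}"


def run(items, host="127.0.0.1", port=9000, workers=2):
    with ThreadPoolExecutor(
        max_workers=workers,
        initializer=init_worker,
        initargs=(host, port),
    ) as executor:
        # map() returns results in input order.
        return list(executor.map(process, items))
```

Every worker thread gets exactly one connection that is reused by all tasks it runs, and no task ever sees another thread's connection.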
