How to release the GIL for a thread in Python using Numba?

Question

I want to make a program that consists of two parts: one receives data and the other writes it to a file. I thought it would be better to use two threads (and possibly two CPU cores) to do the jobs separately. I found this: https://numba.pydata.org/numba-doc/dev/user/jit.html#compilation-options, which allows you to release the GIL. I wonder if it suits my purpose and whether I could adopt it for this kind of job. This is what I tried:

import threading
import time
import os
import queue
import numba
import numpy as np

condition = threading.Condition()
q_text = queue.Queue()

#@numba.jit(nopython=True, nogil=True)
def consumer():
    t = threading.current_thread()

    with condition:
        while True:
            str_test = q_text.get()
            with open('hello.txt', 'a') as f:
                f.write(str_test)
            condition.wait()

def sender():
    with condition:
        condition.notify_all()

def add_q(arr="hi\n"):
    q_text.put(arr)
    sender()

c1 = threading.Thread(name='c1', target=consumer)
c1.start()

add_q()

It works fine without numba, but when I apply it to consumer, it gives me an error:

Exception in thread c1:
Traceback (most recent call last):
  File "d:\python36-32\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "d:\python36-32\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "d:\python36-32\lib\site-packages\numba\dispatcher.py", line 368, in _compile_for_args
    raise e
  File "d:\python36-32\lib\site-packages\numba\dispatcher.py", line 325, in _compile_for_args
    return self.compile(tuple(argtypes))
  File "d:\python36-32\lib\site-packages\numba\dispatcher.py", line 653, in compile
    cres = self._compiler.compile(args, return_type)
  File "d:\python36-32\lib\site-packages\numba\dispatcher.py", line 83, in compile
    pipeline_class=self.pipeline_class)
  File "d:\python36-32\lib\site-packages\numba\compiler.py", line 873, in compile_extra
    return pipeline.compile_extra(func)
  File "d:\python36-32\lib\site-packages\numba\compiler.py", line 367, in compile_extra
    return self._compile_bytecode()
  File "d:\python36-32\lib\site-packages\numba\compiler.py", line 804, in _compile_bytecode
    return self._compile_core()
  File "d:\python36-32\lib\site-packages\numba\compiler.py", line 791, in _compile_core
    res = pm.run(self.status)
  File "d:\python36-32\lib\site-packages\numba\compiler.py", line 253, in run
    raise patched_exception
  File "d:\python36-32\lib\site-packages\numba\compiler.py", line 245, in run
    stage()
  File "d:\python36-32\lib\site-packages\numba\compiler.py", line 381, in stage_analyze_bytecode
    func_ir = translate_stage(self.func_id, self.bc)
  File "d:\python36-32\lib\site-packages\numba\compiler.py", line 937, in translate_stage
    return interp.interpret(bytecode)
  File "d:\python36-32\lib\site-packages\numba\interpreter.py", line 92, in interpret
    self.cfa.run()
  File "d:\python36-32\lib\site-packages\numba\controlflow.py", line 515, in run
    assert not inst.is_jump, inst
AssertionError: Failed at nopython (analyzing bytecode)
SETUP_WITH(arg=60, lineno=17)

There was no error when I excluded condition (threading.Condition) from consumer, so maybe it's because the JIT doesn't interpret it? I'd like to know whether I can use numba for this kind of purpose and how to fix this problem (if it's possible).

Answer 1

You can't use the threading module within a Numba function, and opening/writing a file isn't supported either. Numba is great when you need computational performance, but your example is purely I/O, which is not a use case for Numba.
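For the purely I/O-bound case, a minimal sketch using only the standard library may be enough: queue.Queue.get() already blocks until an item arrives, so the separate threading.Condition is not needed, and CPython releases the GIL while a thread is blocked waiting or doing file I/O. The STOP sentinel, the writer function name, and the temporary output path are assumptions made for illustration; they are not part of the original question.

```python
import os
import queue
import tempfile
import threading

STOP = object()  # hypothetical sentinel telling the writer thread to finish


def writer(q, path):
    """Append every string received on the queue to the file at `path`."""
    with open(path, "a") as f:
        while True:
            item = q.get()  # blocks until an item is available
            if item is STOP:
                break
            f.write(item)


q_text = queue.Queue()
# Illustrative location: a fresh temporary directory instead of the working directory.
out_path = os.path.join(tempfile.mkdtemp(), "hello.txt")

t = threading.Thread(target=writer, args=(q_text, out_path), name="writer")
t.start()

# The "receiving" side just puts data on the queue; no explicit notification is needed.
for msg in ("hi\n", "there\n"):
    q_text.put(msg)

q_text.put(STOP)  # ask the writer to stop
t.join()          # wait until everything has been written
```

The file is opened once rather than on every message, and the sentinel gives the thread a clean way to exit so the program can terminate.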

The only way Numba would add something is if you apply a function to your str_test data. Compiling that function with nogil=True would let it run in several threads at once. But again, that's only worth it if that function is computationally expensive compared to the I/O.
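As a rough sketch of that idea, assume the received data is numeric and the expensive step is a hypothetical sum_of_squares function (neither is in the original question). Compiled with nogil=True, the function can run in several threads in parallel. The ImportError fallback is only there so the sketch still runs without Numba installed; in that case there is no speed-up and the GIL is not released.

```python
import threading

import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback for environments without Numba: a no-op decorator (assumption
    # for illustration only; plain Python keeps the GIL and runs slowly).
    def njit(**kwargs):
        return lambda func: func


@njit(nogil=True)
def sum_of_squares(values):
    """CPU-bound work on a 1-D float array; runs without the GIL under Numba."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def run_threaded(chunks):
    """Process each chunk in its own thread and collect the partial results."""
    results = [0.0] * len(chunks)

    def worker(i):
        results[i] = sum_of_squares(chunks[i])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(chunks))]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results


data = np.arange(100_000, dtype=np.float64)
chunks = np.array_split(data, 4)

sum_of_squares(chunks[0])  # warm-up call so compilation happens before threading
partial_sums = run_threaded(chunks)
total = sum(partial_sums)
```

Whether this pays off depends on how heavy the computation is relative to the I/O, as noted above; for a trivial function the threading overhead will dominate.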

You could look into an async solution, which is more appropriate for I/O-bound workloads.
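One possible shape for such an async version, sketched with asyncio from the standard library: a receiver coroutine puts items on an asyncio.Queue and a writer coroutine drains it. The coroutine names, the None sentinel, the sample messages, and the temporary path are all assumptions for illustration. Because regular file writes are blocking, the sketch hands them to the default thread pool through run_in_executor.

```python
import asyncio
import os
import tempfile


def append_text(path, text):
    """Blocking helper: append `text` to the file at `path`."""
    with open(path, "a") as f:
        f.write(text)


async def receiver(q, messages):
    """Stand-in for the receiving side; real code would await a socket or stream."""
    for msg in messages:
        await asyncio.sleep(0)  # yield control, as a real network await would
        await q.put(msg)
    await q.put(None)  # hypothetical sentinel: no more data


async def writer(q, path):
    """Write each queued item to disk without blocking the event loop."""
    loop = asyncio.get_running_loop()
    while True:
        item = await q.get()
        if item is None:
            break
        # Run the blocking file write in the default thread pool.
        await loop.run_in_executor(None, append_text, path, item)


async def main(path, messages):
    q = asyncio.Queue()
    await asyncio.gather(receiver(q, messages), writer(q, path))


out_path = os.path.join(tempfile.mkdtemp(), "hello_async.txt")
asyncio.run(main(out_path, ["hi\n", "there\n"]))
```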

See this example from the Numba documentation for a case where threading improves performance: https://numba.pydata.org/numba-doc/dev/user/examples.html#multi-threading

Answer 1
2 Accepted 2020-01-02 09:09:35