
Is there a way to release the GIL for pure functions using pure Python?

I think I must be missing something; this seems so right, but I can't see a way to do this.

Say you have a pure function in Python:

from math import sin, cos

def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2*t) - 2 * cos(3*t) - cos(4*t)
    return (x, y)

Is there some built-in functionality or library that provides a wrapper of some sort that can release the GIL during the function's execution?

In my mind I am thinking of something along the lines of:

from math import sin, cos
from somelib import pure

@pure
def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2*t) - 2 * cos(3*t) - cos(4*t)
    return (x, y)

Why do I think this might be useful?

Because multithreading, which is currently attractive only for I/O-bound programs, would become attractive for such functions once they are long-running. For example, something like:

from math import sin, cos
from somelib import pure
from asyncio import run, gather, create_task

@pure  # releases GIL for f
async def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2 * t) - 2 * cos(3 * t) - cos(4 * t)
    return (x, y)


async def main():
    step_size = 0.1
    # Sample t over [0, 10) in increments of step_size.
    result = await gather(*[create_task(f(t * step_size))
                            for t in range(0, round(10 / step_size))])
    return result

if __name__ == "__main__":
    results = run(main())
    print(results)

Of course, multiprocessing offers Pool.map, which can do something very similar. However, if the function returns a non-primitive / complex type, then the worker has to serialize it and the main process has to deserialize it and create a new object, which necessarily creates a copy. With threads, the child thread passes a pointer and the main thread simply takes ownership of the object. Much faster (and cleaner?).
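The difference in object handling described above can be seen in a minimal sketch. It assumes that a pickle round trip is a fair stand-in for what multiprocessing does when it ships a result between processes (pickle is its default serializer); make_result and produced are hypothetical names used only for illustration.

```python
import pickle
from concurrent.futures import ThreadPoolExecutor

# Hypothetical registry so object identity can be checked afterwards.
produced = []

def make_result(n):
    """Hypothetical worker that builds a non-primitive result (a list of lists)."""
    result = [[i, i * i] for i in range(n)]
    produced.append(result)
    return result

# Thread case: the caller receives the very same object the worker built.
with ThreadPoolExecutor(max_workers=1) as pool:
    from_thread = pool.submit(make_result, 3).result()
same_object_from_thread = from_thread is produced[0]

# Process case (simulated): the result is serialized and rebuilt as a new object.
via_pickle = pickle.loads(pickle.dumps(produced[0]))
equal_after_pickle = via_pickle == produced[0]
same_object_after_pickle = via_pickle is produced[0]

print(same_object_from_thread, equal_after_pickle, same_object_after_pickle)
# prints: True True False
```

The thread hands back a reference to the original object, while the pickle round trip yields an equal but distinct object, which is the copy the question is trying to avoid.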

To tie this to a practical problem I encountered a few weeks ago: I was doing a reinforcement learning project, which involved building an AI for a chess-like game. For this, I was simulating the AI playing against itself for more than 100,000 games, each time returning the resulting sequence of board states (a numpy array). Generating these games runs in a loop, and I use this data to create a stronger version of the AI each time. Here, re-creating ("malloc") the sequence of states for each game in the main process was the bottleneck. I experimented with re-using existing objects, which is a bad idea for many reasons, but that didn't yield much improvement.

Edit: This question differs from "How to run functions in parallel?" because I am not just looking for any way to run code in parallel (I know this can be achieved in various ways, e.g. via multiprocessing). I am looking for a way to let the interpreter know that nothing bad will happen when this function gets executed in a parallel thread.

Is there a way to release the GIL for pure functions using pure Python?

In short, the answer is no, because those functions aren't pure on the level on which the GIL operates.

The GIL does not just protect objects from being updated concurrently by Python code; its primary purpose is to prevent the interpreter from performing a data race (which is undefined behavior, i.e. forbidden in the C memory model) while accessing and updating global and shared data. This includes Python-visible singletons such as None, True, and False, but also all globals like modules, shared dicts, and caches. Then there is their metadata such as reference counts and type objects, as well as shared data used internally by the implementation.

Consider the provided pure function:

def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2*t) - 2 * cos(3*t) - cos(4*t)
    return (x, y)

The dis tool reveals the operations that the interpreter performs when executing the function:

>>> dis.dis(f)
  2           0 LOAD_CONST               1 (16)
              2 LOAD_GLOBAL              0 (sin)
              4 LOAD_FAST                0 (t)
              6 CALL_FUNCTION            1
              8 LOAD_CONST               2 (3)
             10 BINARY_POWER
             12 BINARY_MULTIPLY
             14 STORE_FAST               1 (x)
             ...
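The global lookups in a listing like this can also be pulled out programmatically. The following is a small sketch using only the standard dis module and the function's code object; the exact opcode names for arithmetic differ between CPython versions, so it filters only on LOAD_GLOBAL and on co_names rather than relying on the full listing.

```python
import dis
from math import sin, cos

def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2*t) - 2 * cos(3*t) - cos(4*t)
    return (x, y)

# Names the bytecode resolves outside the function's local scope.
global_names = f.__code__.co_names

# The same information taken from the instruction stream itself.
global_loads = [ins.argval for ins in dis.get_instructions(f)
                if ins.opname == "LOAD_GLOBAL"]

print(global_names)
print(global_loads)
```

Every entry in global_loads is a lookup in shared module state, even though f looks self-contained at the source level.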

To run the code, the interpreter must access the global symbols sin and cos in order to call them. It accesses the integers 2, 3, 4, 5, 13, and 16, which are all cached and therefore also global. In case of an error, it looks up the exception classes in order to instantiate the appropriate exceptions. Even when these global accesses don't modify the objects, they still involve writes because they must update the reference counts.
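The point about reference counts can be observed directly with sys.getrefcount. This is a minimal sketch assuming standard CPython; Marker is a hypothetical class standing in for any shared object. A freshly created object is used because some CPython versions treat certain built-in singletons specially, so their counts may not move.

```python
import sys

class Marker:
    """Hypothetical stand-in for any shared object, such as a module or function."""

shared = Marker()

before = sys.getrefcount(shared)
alias = shared  # only takes another reference; the object itself is not mutated
after = sys.getrefcount(shared)

print(after - before)
# prints: 1
```

Merely holding a second reference changed a field stored inside the object, which is the kind of hidden write that is unsafe to perform from several threads without synchronization.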

None of that can be done safely from multiple threads without synchronization. While it is conceivably possible to modify the Python interpreter to implement truly pure functions that don't access global state, it would require significant modifications to the internals, affecting compatibility with existing C extensions, including the vastly popular scientific ones. This last point is the principal reason why removing the GIL has proven to be so difficult.
