Is there a way to serialize/deserialize without engaging the Python GIL?

A quick test shows that cPickle (in Python 3.6.9, import pickle uses the C implementation by default) engages the GIL.

import os
import pickle
import threading
import timeit

big_data = os.urandom(10000000)

def run():
    pickle.loads(pickle.dumps(big_data))

t = timeit.Timer(run)

# Start 4 threads that each repeat the round trip 2000 times.
for _ in range(4):
    threading.Thread(target=lambda: t.timeit(number=2000)).start()

In that test, 4 threads running serialization operations use 100% CPU in total, i.e. pickle holds the GIL. The same kind of test running a numpy operation uses 400% CPU (numpy does not hold the GIL).
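The CPU-usage observation can also be checked with wall-clock timings. The following is a minimal sketch, not the asker's code: the helper names (round_trip, time_serial, time_threaded) are made up for illustration, and the payload is smaller than the 10 MB used in the question so that the run stays short.

```python
import os
import pickle
import threading
import time

# Assumption: 1 MB payload instead of the question's 10 MB, to keep the run short.
payload = os.urandom(1_000_000)


def round_trip(repeats):
    """Serialize and deserialize the payload `repeats` times."""
    for _ in range(repeats):
        pickle.loads(pickle.dumps(payload))


def time_serial(n_jobs, repeats):
    """Wall-clock seconds to run n_jobs batches one after another."""
    start = time.perf_counter()
    for _ in range(n_jobs):
        round_trip(repeats)
    return time.perf_counter() - start


def time_threaded(n_jobs, repeats):
    """Wall-clock seconds to run n_jobs batches on n_jobs threads."""
    threads = [
        threading.Thread(target=round_trip, args=(repeats,))
        for _ in range(n_jobs)
    ]
    start = time.perf_counter()
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"serial:   {time_serial(4, 50):.3f} s")
    print(f"threaded: {time_threaded(4, 50):.3f} s")
```

If the two timings come out close to each other, the threads were effectively taking turns, which is what one would expect when the GIL is held. A clearly shorter threaded time would suggest the work ran in parallel. Exact numbers depend on the machine and the Python build.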

I was hoping that cPickle, being a C function, wouldn't engage the GIL. Is there any way around this? I'd like to be able to deserialize a large amount of data without blocking the main process.

I am trying to pull upward of 3 GB of data per second from worker processes back to the main process. I can move the data with streaming sockets and asyncio at 4 GB/sec, but the deserialization is a bottleneck. Unfortunately, I don't yet have the luxury of Python 3.8 and SharedMemory.

An acceptable answer is, of course, a confident No.

Taking @juanpa.arrivillaga's answer from the comments to close this question:

I don't see why the fact that the module is a C extension should make you think that it wouldn't engage the GIL. From my understanding, the fundamental problem the GIL solves is thread-safe access to Python interpreter-level objects, which rely on reference counting for garbage collection. Since pickle serialization/deserialization touches Python objects that other threads might have access to, it has to engage the GIL.
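To illustrate the point that being written in C does not by itself mean the GIL is released, here is a rough sketch that counts how many iterations a pure-Python loop completes on the main thread while a worker thread repeatedly calls a C-implemented function. hashlib is used only as a stand-in for comparison (it is not mentioned in the question); its documentation states that the GIL is released while hashing data larger than 2047 bytes when the algorithm is supplied by OpenSSL. The helper name count_ticks_during and the 5 MB buffer are assumptions made for this example.

```python
import hashlib
import os
import pickle
import threading
import time

# Assumption: a 5 MB random buffer, chosen only for illustration.
data = os.urandom(5_000_000)


def count_ticks_during(work, duration=0.5):
    """Call work() in a loop on a worker thread for about `duration` seconds.

    Returns how many iterations a pure-Python loop on the calling thread
    completed in that time. Each iteration needs the GIL.
    """
    stop = threading.Event()

    def worker():
        while not stop.is_set():
            work()

    th = threading.Thread(target=worker)
    th.start()
    ticks = 0
    deadline = time.perf_counter() + duration
    while time.perf_counter() < deadline:
        ticks += 1
    stop.set()
    th.join()
    return ticks


if __name__ == "__main__":
    hashing = count_ticks_during(lambda: hashlib.sha256(data).digest())
    pickling = count_ticks_during(lambda: pickle.loads(pickle.dumps(data)))
    print("ticks while hashing: ", hashing)
    print("ticks while pickling:", pickling)
```

A higher tick count means the main thread was blocked less. One would expect the hashing run to leave the main thread more room than the pickling run, but the size of the gap depends on the machine, the Python build, and the interpreter's thread switch interval, so treat the output as indicative only.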
