
Python and multithreading

The Python Py_INCREF macro is defined like this:

#define Py_INCREF(op) (                         \
    _Py_INC_REFTOTAL  _Py_REF_DEBUG_COMMA       \
    ((PyObject *)(op))->ob_refcnt++)

With multiple cores, the increment may only happen in the L1 cache and not be flushed to main memory.

If two threads increment the refcount at the same time, on different cores, without a flush to main memory, it seems to me that one increment could be lost:

- ob_refcnt = 1
- Core 1 increments, but does not flush => ob_refcnt = 2 in the L1 cache of core 1
- Core 2 increments, but does not flush => ob_refcnt = 2 in the L1 cache of core 2
- One of the increments is lost

Is it a risk to use multiple cores or multiple processes?

PyObject is declared like this:

typedef struct _object {
    _PyObject_HEAD_EXTRA
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;

But Py_ssize_t is just a ssize_t or intptr_t.

The _Py_atomic* functions and attributes do not seem to be used.

How can Python manage this scenario? How can it flush the cache between threads?

The CPython implementation of Python has the global interpreter lock (GIL). It is undefined behaviour to call the vast majority of Python C API functions (including Py_INCREF) without holding this lock, and doing so will almost certainly result in inconsistent data or your program crashing.

The GIL can be released and acquired as described in the documentation.
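For example, blocking calls in the standard library (such as time.sleep) release the GIL internally, which is why two threads can wait at the same time. A small sketch to observe this from Python (the sleep duration and timing margin are arbitrary):

```python
import threading
import time

def sleeper():
    # time.sleep releases the GIL while waiting, so both threads
    # can block concurrently instead of serially.
    time.sleep(0.3)

start = time.monotonic()
threads = [threading.Thread(target=sleeper) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

# Serial execution would take about 0.6 s; concurrent waiting takes about 0.3 s.
assert elapsed < 0.5
```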

Because of the need to hold this lock in order to operate on Python objects, multithreading in Python is quite limited, and the only operations that parallelize well are things like waiting for IO or pure C calculations on large arrays. The multiprocessing module (which starts isolated Python processes) is another option for parallel Python.


There have been attempts to use atomic types for reference counting (to remove or minimize the need for the GIL), but these caused significant slowdowns in single-threaded code, so they were abandoned.

Why not use Python's Lock or Semaphore ( https://docs.python.org/2/library/threading.html )?
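Note that threading.Lock protects your own shared data from interleaved updates; it does not (and need not) protect the interpreter's internal reference counts, which the GIL already covers. A minimal sketch of the former (the thread and iteration counts are arbitrary):

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        # The lock makes the read-modify-write of counter atomic
        # with respect to the other threads.
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(50_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert counter == 200_000
```

Without the lock, `counter += 1` is not atomic at the bytecode level and updates can be lost; with it, the total is always exact.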
