
Is Python's dict.items() thread-safe?

Python raises an exception if a dictionary changes its size during iteration using iteritems().
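For illustration, here is a minimal single-threaded sketch of that exception. It assumes Python 3, where items() returns a live view and so behaves like Python 2's iteritems() in this respect; the dictionary contents and the `error_message` name are made up for the example.

```python
# Python 3: items() returns a live view, comparable to Python 2's iteritems().
d = {"a": 1, "b": 2}
error_message = None

try:
    for key, value in d.items():
        # Adding a key makes the dict grow while it is being iterated.
        d[key + "_copy"] = value
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)  # dictionary changed size during iteration
```

In a multithreaded program the insertion would come from another thread, but the iterating thread can hit the same error.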

I ran into this problem because my program is multithreaded, and there are cases where I need to iterate over the dict while another thread is adding keys to it.

Fortunately, I don't need the iteration to cover every element in the dict precisely. I am therefore thinking of using items() instead of iteritems() for the iteration. I think items() makes a static snapshot of the dict, which would let me work around the problem.

My question is: does items() raise an exception if the dict's size changes while items() is executing?

Thanks.

This answer pertains to CPython's implementation of Python 2 and Python 3.

Since the items() method is implemented purely in C and does not release the GIL before or during its execution, no other thread can acquire the GIL, and the underlying data structure remains unchanged while the method runs.

Thus, the invocation of the items() method in Python 2 and in Python 3 is guaranteed to succeed.

However, while the operations performed on the returned object are thread-safe in Python 2, they are not in Python 3.

Python 2

The returned object is a copy of the dictionary's list of (key, value) pairs. As such, it is decoupled from the dictionary, and operations performed on the dictionary can't affect the copy in hand.

It's worth noting, though, that the items themselves aren't copies of the original items, so care should be taken when dealing with mutable items.
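As a rough sketch of both points, the snippet below builds the same kind of copy in Python 3 with list(d.items()), which is assumed here to stand in for Python 2's items(). The dictionary contents are made up, and the ordering relies on dicts preserving insertion order (Python 3.7+). It only shows the decoupling; it does not demonstrate whether building the copy is itself safe against a concurrent writer.

```python
d = {"a": [1], "b": [2]}

# A list of (key, value) tuples, decoupled from the dict itself.
snapshot = list(d.items())

# Changing the dict afterwards does not change the snapshot's length...
d["c"] = [3]
snapshot_length = len(snapshot)

# ...but the values are the same objects, so mutating one is visible
# through both the dict and the snapshot.
d["a"].append(99)
first_pair = snapshot[0]

print(snapshot_length)  # 2
print(first_pair)       # ('a', [1, 99])
```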

Python 3

The returned object is a view object. As such, certain operations performed on it might fail, as detailed in the official documentation:

Iterating views while adding or deleting entries in the dictionary may raise a RuntimeError or fail to iterate over all entries.
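A small sketch of what "view" means in practice, using a made-up dictionary: the object returned by items() in Python 3 keeps tracking the dictionary rather than freezing its contents.

```python
d = {"a": 1}
view = d.items()  # a live view, not a copy

size_before = len(view)
d["b"] = 2        # modify the dict after the view was created
size_after = len(view)

print(size_before, size_after)  # 1 2
print(("b", 2) in view)         # True
```

Because the view follows the dictionary, iterating it while another thread inserts or deletes keys is exposed to the behavior quoted above.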

As the excellent comments noted:

  1. This is not thread safe.

  2. You should really use a lock when doing such things.
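One way to follow that advice is sketched below, assuming a single threading.Lock guards every access to the shared dict. The names `shared`, `lock`, `writer`, and `reader` are hypothetical, not taken from the question. The reader copies the items while holding the lock and would then iterate over the copy outside it.

```python
import threading

shared = {}
lock = threading.Lock()

def writer(count):
    # Every insertion happens while holding the lock.
    for i in range(count):
        with lock:
            shared[i] = i * i

def reader(rounds, seen_sizes):
    for _ in range(rounds):
        # Copy under the lock; the dict cannot change during the copy.
        with lock:
            snapshot = list(shared.items())
        # Work on the snapshot outside the lock.
        seen_sizes.append(len(snapshot))

seen_sizes = []
threads = [
    threading.Thread(target=writer, args=(1000,)),
    threading.Thread(target=reader, args=(100, seen_sizes)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(shared))  # 1000
```

Holding the lock only for the copy keeps the critical section short, at the cost of the reader working on slightly stale data, which the question says is acceptable.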

It is possible to see this in the CPython source code, dictobject.c:

If you look at the function

static PyObject *
dict_items(register PyDictObject *mp)

which is used for items, you can see that (after some clever pre-allocation for the results) it basically iterates over the array mp->ma_table (using a mask to see where there are entries).

Now, if you look at the function

static int
dictresize(PyDictObject *mp, Py_ssize_t minused)

which is used when the table needs to be resized, you can see that ma_table's elements can be moved into a completely different buffer, and the old buffer can then be freed using PyMem_Free.

So you run a very real risk of accessing garbage memory if these operations happen concurrently without synchronization.
