Extracting keys and values from a dictionary in Python (in a thread-safe manner)

I have a simple function for extracting keys and values from a dictionary.

def separate_kv_fast(adict):
    '''Separates keys/values from a dictionary to corresponding arrays'''
    return adict.keys(), adict.values() 

I know the order is guaranteed if the dictionary "adict" is not modified between the .keys() and .values() calls. What I am wondering is whether the return statement guarantees this; basically, is it going to be thread safe?

Is the following version, which first copies "adict", any safer for multi-threading, or is the copy unnecessary?

def separate_kv_fast(adict):
    '''Separates keys/values from a dictionary to corresponding arrays'''
    bdict = dict(adict)
    return bdict.keys(), bdict.values() 

I've been working on learning Python disassembly, and I believe this shows the two calls are not atomic:

>>> dis.dis(separate_kv_fast)                                                                             
  2           0 LOAD_FAST                0 (adict)
              3 LOAD_ATTR                0 (keys)
              6 CALL_FUNCTION            0
              9 LOAD_FAST                0 (adict)
             12 LOAD_ATTR                1 (values)
             15 CALL_FUNCTION            0
             18 BUILD_TUPLE              2
             21 RETURN_VALUE        
>>> 

The fact that it calls keys and values across multiple opcodes, I believe, demonstrates that it is not atomic.
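To check this on a current interpreter, the sketch below uses the standard dis module to list the attribute lookups in the function. Opcode names and offsets differ between Python versions (the listing above is from Python 2), so it accepts either LOAD_ATTR or LOAD_METHOD; attribute_lookups is a helper name made up for this illustration.

```python
import dis

def separate_kv_fast(adict):
    return adict.keys(), adict.values()

def attribute_lookups(func):
    # Hypothetical helper: collect the instructions that look up an
    # attribute or method, in bytecode order.
    found = []
    for ins in dis.get_instructions(func):
        if ins.opname in ("LOAD_ATTR", "LOAD_METHOD"):
            found.append((ins.offset, ins.opname, ins.argrepr))
    return found

lookups = attribute_lookups(separate_kv_fast)
for offset, opname, argrepr in lookups:
    print(offset, opname, argrepr)
```

The two lookups appear at different offsets with other instructions between them, which is where a thread switch could in principle occur.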

Let's see how your bdict = dict(adict) works out:

  2           0 LOAD_GLOBAL              0 (dict)
              3 LOAD_FAST                0 (adict)
              6 CALL_FUNCTION            1
              9 STORE_FAST               1 (bdict)

LOAD_FAST pushes a reference to adict onto the stack. We then call dict with that argument. What we don't know is whether the dict() function is atomic.

bdict = adict.copy() gives a similar disassembly. adict.copy itself can't be disassembled, since it is a built-in method with no Python bytecode.

Everything I read says that internal types are thread safe. So I believe a single function call into a dictionary would be internally consistent, i.e., items(), copy(), values(), keys(), etc. Two calls in series (values() followed by keys()) aren't necessarily safe. Neither are iterators.
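If two separate calls must see the same state, one option not covered in this thread is to guard the dictionary with a lock. The sketch below is Python 3 and only works under the assumption that every thread writing to the dictionary acquires the same lock; LockedDict and its methods are hypothetical names.

```python
import threading

class LockedDict:
    # Hypothetical wrapper: all reads and writes go through one lock.
    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def separate_kv(self):
        # Both extractions happen while writers are blocked.
        with self._lock:
            return list(self._data.keys()), list(self._data.values())

shared = LockedDict()
workers = [threading.Thread(target=shared.set, args=(i, str(i)))
           for i in range(100)]
for w in workers:
    w.start()
for w in workers:
    w.join()

keys, values = shared.separate_kv()
print(len(keys), len(values))
```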

Is there a reason you're not just using items()?
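A minimal sketch of the items() approach, written for Python 3, where items() returns a live view rather than a list, so the pairs are first snapshotted with list(). It assumes CPython, where that single call is commonly treated as completing without another thread's update interleaving; this is an assumption, not a documented guarantee. separate_kv_items is a name introduced here for illustration.

```python
def separate_kv_items(adict):
    # Snapshot all (key, value) pairs in one call, then split the snapshot.
    # After this line, later changes to adict cannot affect the result.
    pairs = list(adict.items())
    keys = [k for k, _ in pairs]
    values = [v for _, v in pairs]
    return keys, values

keys, values = separate_kv_items({"a": 1, "b": 2, "c": 3})
print(keys, values)
```

Because both lists are built from the same snapshot, keys[i] always corresponds to values[i].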

I was curious, so I went ahead and benchmarked:

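#!/usr/bin/python
# Python 2 script: keys(), values() and items() return lists here
import timeit

# Build a 1000-entry dictionary to extract from
D = dict()
for x in xrange(0, 1000):
    D[x] = str(x)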

def a():
    return D.keys(), D.values()

def b():
    keys = []
    values = []
    for k, v in D.items():
        keys.append(k)
        values.append(v)
    return keys, values

def c():
    d = D.copy()
    return d.keys(), d.values()

def d():
    return zip(*D.items())

print timeit.timeit("a()", 'from __main__ import a')
print timeit.timeit("b()", 'from __main__ import b')
print timeit.timeit("c()", 'from __main__ import c')
print timeit.timeit("d()", 'from __main__ import d')

Results:

6.56165385246
145.151810169
19.9027020931
65.4051799774

The copy is the fastest atomic one (and might be slightly faster than using dict()).
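The benchmark above is Python 2 (xrange, print statements). A rough Python 3 port is sketched below for anyone who wants to rerun it; the list() wrappers and the smaller number=1000 repeat count are choices made for this sketch, so the timings will not be comparable to the figures above.

```python
import timeit

D = {x: str(x) for x in range(1000)}

def a():
    # Two separate calls, materialized as lists
    return list(D.keys()), list(D.values())

def b():
    # Manual loop over items()
    keys, values = [], []
    for k, v in D.items():
        keys.append(k)
        values.append(v)
    return keys, values

def c():
    # Copy first, then extract from the private copy
    snapshot = D.copy()
    return list(snapshot.keys()), list(snapshot.values())

def d():
    # Unzip the items; yields two tuples instead of two lists
    return tuple(zip(*D.items()))

for func in (a, b, c, d):
    elapsed = timeit.timeit(func, number=1000)
    print(func.__name__, elapsed)
```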

