What is the most efficient way to get a list/set of keys from a dictionary in Python?
To quickly compare the keys of two dictionaries, I'm creating sets of the keys using this method:
dict_1 = {"file_1":10, "file_2":20, "file_3":30, "file_4":40}
dict_2 = {"file_1":10, "file_2":20, "file_3":30}
set_1 = {file for file in dict_1}
set_2 = {file for file in dict_2}
Then I use diff_set = set_1 - set_2 to see which keys are missing from set_2.
Is there a faster way? I see that using set(dict.keys()) is less of a workaround, so I'll switch to it, but is it more efficient?
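For reference, iterating over a dictionary yields its keys, so the comprehension, set(d) and set(d.keys()) all build the same set; they differ only in overhead. A minimal sketch using dict_1 from above (the `via_*` names are only illustrative):

```python
dict_1 = {"file_1": 10, "file_2": 20, "file_3": 30, "file_4": 40}

# Iterating a dict yields its keys, so these three expressions build equal sets.
via_comprehension = {file for file in dict_1}
via_set = set(dict_1)
via_keys = set(dict_1.keys())

print(via_comprehension == via_set == via_keys)  # True
```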
Let's measure more properly (not just timing a single execution, and not including the setup in the timing) and include faster solutions:
300 ns 300 ns 300 ns {*dict_1} - {*dict_2}
388 ns 389 ns 389 ns {file for file in dict_1 if file not in dict_2}
389 ns 390 ns 390 ns dict_1.keys() - dict_2
458 ns 458 ns 458 ns set(dict_1) - set(dict_2)
472 ns 472 ns 472 ns dict_1.keys() - dict_2.keys()
665 ns 665 ns 668 ns set(dict_1.keys()) - set(dict_2.keys())
716 ns 716 ns 716 ns {file for file in dict_1} - {file for file in dict_2}
Benchmark code (Try it online!):
import timeit
setup = '''
dict_1 = {"file_1":10, "file_2":20, "file_3":30, "file_4":40}
dict_2 = {"file_1":10, "file_2":20, "file_3":30}
'''
codes = [
    '{file for file in dict_1} - {file for file in dict_2}',
    'set(dict_1) - set(dict_2)',
    'set(dict_1.keys()) - set(dict_2.keys())',
    'dict_1.keys() - dict_2',
    'dict_1.keys() - dict_2.keys()',
    '{*dict_1} - {*dict_2}',
    '{file for file in dict_1 if file not in dict_2}',
]

# Sanity check: every candidate should print the same result.
exec(setup)
for code in codes:
    print(eval(code))

# Collect 20 rounds of timings per candidate; after each round, print the
# candidates sorted by their best times, showing the three fastest per call.
tss = [[] for _ in codes]
for _ in range(20):
    print()
    for code, ts in zip(codes, tss):
        number = 10000
        t = min(timeit.repeat(code, setup, number=number)) / number
        ts.append(t)
    for code, ts in sorted(zip(codes, tss), key=lambda cs: sorted(cs[1])):
        print(*('%3d ns ' % (t * 1e9) for t in sorted(ts)[:3]), code)
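The dict_1.keys() - dict_2 variant works because key views are set-like: their set operators accept any iterable on the other side, and iterating a dict yields its keys. A small sketch with the same dictionaries (the names `missing`, `common` and `either` are only illustrative):

```python
dict_1 = {"file_1": 10, "file_2": 20, "file_3": 30, "file_4": 40}
dict_2 = {"file_1": 10, "file_2": 20, "file_3": 30}

# Key views support set operators directly; each operation returns a new set.
missing = dict_1.keys() - dict_2        # keys in dict_1 but not in dict_2
common = dict_1.keys() & dict_2         # keys present in both
either = dict_1.keys() ^ dict_2.keys()  # keys present in exactly one of them

print(missing)         # {'file_4'}
print(sorted(common))  # ['file_1', 'file_2', 'file_3']
print(either)          # {'file_4'}
```

This avoids building an explicit set for the left-hand side, although the timings above suggest that for very small dictionaries {*dict_1} - {*dict_2} can still come out ahead.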
The fastest and most efficient way would be:
diff_set = {*dict_1} - {*dict_2}
Output:
{'file_4'}
import timeit
dict_1 = {"file_1":10, "file_2":20, "file_3":30, "file_4":40}
dict_2 = {"file_1":10, "file_2":20, "file_3":30}
def method1():
    return {file for file in dict_1} - {file for file in dict_2}

def method2():
    return set(dict_1) - set(dict_2)

def method3():
    return set(dict_1.keys()) - set(dict_2.keys())

def method4():
    return dict_1.keys() - dict_2.keys()

def method5():
    return {*dict_1} - {*dict_2}
print(method1())
print(method2())
print(method3())
print(method4())
print(method5())
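# Average seconds per call over 10000 calls of each method.
print(f"It took {timeit.timeit(stmt=method1, number=10000) / 10000} sec for method 1")
print(f"It took {timeit.timeit(stmt=method2, number=10000) / 10000} sec for method 2")
print(f"It took {timeit.timeit(stmt=method3, number=10000) / 10000} sec for method 3")
print(f"It took {timeit.timeit(stmt=method4, number=10000) / 10000} sec for method 4")
print(f"It took {timeit.timeit(stmt=method5, number=10000) / 10000} sec for method 5")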
Output:
It took 1.6434900000149355e-06 sec for method 1
It took 8.317999999690073e-07 sec for method 2
It took 1.1994899999990594e-06 sec for method 3
It took 9.747700000389159e-07 sec for method 4
It took 8.049199999732082e-07 sec for method 5
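These numbers come from four-key dictionaries, where fixed per-call overhead dominates, so the ranking may not hold for larger inputs. A hedged sketch for re-running the comparison at other sizes; `make_dicts` and `bench` are hypothetical helpers, and the sizes and repeat counts are arbitrary choices:

```python
import timeit


def make_dicts(n):
    """Build two dicts with n and n - 1 keys (illustrative data only)."""
    dict_1 = {f"file_{i}": i for i in range(n)}
    dict_2 = {f"file_{i}": i for i in range(n - 1)}
    return dict_1, dict_2


CODES = [
    "{*dict_1} - {*dict_2}",
    "dict_1.keys() - dict_2",
    "dict_1.keys() - dict_2.keys()",
    "set(dict_1) - set(dict_2)",
    "{file for file in dict_1 if file not in dict_2}",
]


def bench(n, number=1000, repeat=5):
    """Return {code: best seconds per call} for dictionaries of size n."""
    dict_1, dict_2 = make_dicts(n)
    env = {"dict_1": dict_1, "dict_2": dict_2}
    return {
        code: min(timeit.repeat(code, globals=env, number=number, repeat=repeat)) / number
        for code in CODES
    }


if __name__ == "__main__":
    for n in (4, 1000):
        print(f"n = {n}")
        # Print candidates from fastest to slowest for this size.
        for code, t in sorted(bench(n).items(), key=lambda kv: kv[1]):
            print(f"{t * 1e9:10.0f} ns  {code}")
```

Results will vary with the Python version and machine, so it is worth measuring with data that resembles the real workload before settling on one form.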