
Are these two dictionary statements the same when looping over a dict in a for loop?

I read that my_dict.keys() returns a dynamic view that can be used to iterate over the dictionary. I usually iterate over the dictionary without calling keys().

So my question is: are the two code blocks below the same? If not, what performance differences do they have (which one is more optimized)?

# without keys() function
my_dict = {'key1' : 'value1',"key2" : "value2"}

for key in my_dict:
  print("current key is",key)

# with keys() function
my_dict = {'key1' : 'value1',"key2" : "value2"}

for key in my_dict.keys():
  print("current key is",key)

Note: I usually use python version 3.7+ so if there's any version-wise implementation difference, kindly attach resources that I can refer to.

The two code blocks behave the same. The only difference is that calling .keys() creates a temporary keys view object that is thrown away almost immediately: the implicit call to iter() that begins the for loop creates a dict_keyiterator object that references the underlying dict, and after that the view is no longer needed. The cost of creating a keys view is small and fixed, so for large dicts it is irrelevant.

Proving they're the same (run on 3.10.8):

>>> print(type(iter({})))
<class 'dict_keyiterator'>

>>> print(type(iter({}.keys())))
<class 'dict_keyiterator'>
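
As a further sanity check, here is a minimal sketch (the variable names are just for illustration) that compares what the two loops actually produce, rather than only the iterator type. It assumes Python 3.7+, where dicts preserve insertion order.

```python
my_dict = {'key1': 'value1', 'key2': 'value2'}

# Collect the keys seen by each style of loop.
keys_direct = [key for key in my_dict]
keys_via_view = [key for key in my_dict.keys()]

# Both loops should visit the same keys in the same order,
# and both should be driven by the same iterator type.
same_order = keys_direct == keys_via_view
same_iterator_type = type(iter(my_dict)) is type(iter(my_dict.keys()))

print(same_order, same_iterator_type)
```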

On Python 2, there was a big difference (.keys() would eagerly shallow copy the keys to a new list), but in Python 3, .keys() is just wasting a tiny fixed amount of work making the keys view up front, and otherwise the rest of the loop behaves and performs identically.
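
To illustrate the "view, not copy" point, here is a small sketch for Python 3.7+ (names like view and snapshot are only illustrative). The view reflects later changes to the dict, while list(my_dict) makes the kind of eager copy that Python 2's .keys() used to make.

```python
my_dict = {'key1': 'value1'}

view = my_dict.keys()      # cheap: no keys are copied here
snapshot = list(my_dict)   # eager copy, similar to Python 2's .keys()

my_dict['key2'] = 'value2'

# The view follows the dict; the snapshot does not.
live_keys = list(view)
print(live_keys)   # ['key1', 'key2']
print(snapshot)    # ['key1']
```

This is also why building the view costs the same regardless of how many keys the dict holds.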

It sounds like for key in my_dict is slightly faster... try:

from time import time

my_dict = {'key1' : 'value1',"key2" : "value2"}

start = time()
for _ in range(1000000):
    for key in my_dict:
      a = key
print(time() - start)


my_dict = {'key1' : 'value1',"key2" : "value2"}

start = time()
for _ in range(1000000):
    for key in my_dict.keys():
      a = key
print(time() - start)

# 0.28826069831848145
# 0.3530569076538086

Which is what I'd expect because for key in my_dict.keys(): involves one more method call.


Edit: it appears I used a crude method (time.time()) for the measurements, as @ShadowRanger pointed out. See his insightful answer and comments. Also, apologies for referring to outdated documentation.

A better approach is to use timeit, which shows that the results in both cases are almost identical, as pointed out in @ShadowRanger's answer, confirming other observations in the comments.

from timeit import repeat

repeat(setup="my_dict = {i : 'value' for i in range(1000)}", stmt="""
for _ in range(100):
    for key in my_dict.keys():
      a = key
""", number=1000
)

repeat(setup="my_dict = {i : 'value' for i in range(1000)}", stmt="""
for _ in range(100):
    for key in my_dict:
      a = key
""", number=1000
)

"""
[2.141656800000419,
 2.098019299999578,
 2.0064776999997775,
 1.9754404000004797,
 2.020252399999663]

[1.9935685999998896,
 2.021206500000517,
 2.028992300000027,
 2.026352799999586,
 2.0209632999994938]
"""
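
If you want a single number per variant, one possible refinement (a sketch, not part of the measurements above) is to take the minimum of the repeats, which the timeit documentation suggests as the most meaningful figure. The statement strings and the repeat/number values here are my own choices, and the timings will differ from machine to machine.

```python
from timeit import repeat

setup = "my_dict = {i: 'value' for i in range(1000)}"

# Each call returns a list of 5 timings; each timing covers 1000 full loops.
direct_times = repeat("for key in my_dict: pass",
                      setup=setup, repeat=5, number=1000)
via_keys_times = repeat("for key in my_dict.keys(): pass",
                        setup=setup, repeat=5, number=1000)

# The minimum is the run least disturbed by other activity on the machine.
direct_best = min(direct_times)
via_keys_best = min(via_keys_times)

print(f"for key in my_dict:        {direct_best:.4f}s")
print(f"for key in my_dict.keys(): {via_keys_best:.4f}s")
```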


 