
How to get values from a list of dictionaries, which themselves contain lists of dictionaries in Python

I ran into the problem of getting values from a list of dictionaries, where each dictionary holds a list that itself contains a dictionary. It may sound easy, but it took me some time, so I think posting it can be useful to other people. An example of my data:

 player_info = [{[{'tag': 'tag 1'}]}, {[{'tag': 'tag 2'}]}]

The outer list is called 'player_info'. It contains 25 dictionaries, where each one contains a list that contains (among other things) a dictionary called 'opponent', which contains a list that contains a dictionary (yeah, pretty messy). From that innermost dictionary, I wanted the value associated with the 'tag' key.

I figured out two ways:

  1. Create a loop.
 for i in range(25): print(player_info[i]['opponent'][0]['tag'])
  2. Iterate through the list:
 {each_dictionary['opponent'][0]['tag'] for each_dictionary in player_info}

I assume that the second way must be more efficient. Let me know what you think, and whether there is a smarter way to do it.

First: a dict requires a key-value association for every element. Your second-level data structure, however, does not include keys: {[{'tag': 'tag 1'}]}. This is a set. Unlike dicts, sets do not have keys associated with their elements, so your data structure looks like List[Set[List[Dict[str, str]]]].
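To illustrate the distinction, here is a minimal sketch of how Python interprets brace literals. The variable names are made up for the example and do not come from the question.

```python
# Braces with key: value pairs build a dict; braces with bare elements build a set.
with_keys = {'tag': 'tag 1'}
without_keys = {'tag 1'}
empty = {}  # empty braces are a dict, not a set

print(type(with_keys).__name__)     # dict
print(type(without_keys).__name__)  # set
print(type(empty).__name__)         # dict
```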

Second: when I try to run

# python 3.8.8
player_info = [{[{'tag': 'tag 1'}]},
               {[{'tag': 'tag 2'}]}]

I receive the error TypeError: unhashable type: 'list'. That's because your code attempts to put a list inside a set. Set membership in Python requires the members to be hashable, and list objects do not define a usable __hash__() method. Even if you resolve this by replacing the list with a tuple, you will find that dict objects are not hashable either. Potential solutions include using immutable objects like frozendict or tuple, but that is another post.
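As a rough sketch of one way around the error: the question's own loop indexes each element with an 'opponent' key, so the real data is probably a list of dicts rather than a list of sets. The keyed structure below is an assumption based on that description, not the asker's actual data.

```python
# Building a set that contains a list fails at runtime: lists are unhashable.
try:
    bad = [{[{'tag': 'tag 1'}]}]
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Assumed structure: each player dict stores its list under an 'opponent' key.
player_info = [
    {'opponent': [{'tag': 'tag 1'}]},
    {'opponent': [{'tag': 'tag 2'}]},
]

# Collect the innermost 'tag' values into a list.
tags = [player['opponent'][0]['tag'] for player in player_info]
print(tags)  # ['tag 1', 'tag 2']
```

With keys in place, every level is either a list or a dict, so nothing needs to be hashable except the string keys.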

To answer your question, I have reformulated your problem as

player_info = [[[{'tag': 'tag 1'}]],
               [[{'tag': 'tag 2'}]]]

and compared the performance of A) an explicit loop:

for i in range(len(player_info)):
  print(player_info[i][0][0]['tag'])

against B) a list comprehension:

[
  print(single_player_info[0][0]['tag']) 
  for single_player_info in player_info
]

Running the above code blocks in Jupyter with the %%timeit cell magic, I got: A) 154 µs ± 14.6 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each) and B) 120 µs ± 11 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each).

Note: This experiment is highly skewed for at least two reasons:

  1. I tested both approaches using only the data you provided (N=2). With larger inputs, the two approaches may well scale differently than this small sample suggests.
  2. print consumes a lot of time and makes the measurement heavily dependent on the state of the kernel.
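A sketch of a comparison that sidesteps both caveats might look like the following. It uses a larger input in the same nested-list shape and collects the tags into a list instead of printing them. The size N, the repeat count, and the function names are arbitrary choices for illustration, and no timings are claimed here since they depend on the machine.

```python
import timeit

# Hypothetical larger input in the same nested-list shape (N chosen arbitrarily).
N = 10_000
player_info = [[[{'tag': f'tag {i}'}]] for i in range(N)]

def with_index_loop():
    # Approach A without print: index-based loop appending to a list.
    tags = []
    for i in range(len(player_info)):
        tags.append(player_info[i][0][0]['tag'])
    return tags

def with_comprehension():
    # Approach B without print: list comprehension over the elements.
    return [single_player_info[0][0]['tag'] for single_player_info in player_info]

# Both approaches must produce the same result before timing means anything.
assert with_index_loop() == with_comprehension()

loop_time = timeit.timeit(with_index_loop, number=100)
comp_time = timeit.timeit(with_comprehension, number=100)
print(f'index loop:    {loop_time:.4f} s for 100 runs')
print(f'comprehension: {comp_time:.4f} s for 100 runs')
```

Varying N and rerunning gives a rough picture of how each approach scales.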

I hope this answers your question.


 