So, instead of trying to explain things first, I will just show you what I have and what I want (this is easier):
What I have:
dict_list = [
    {'some': 1.2, 'key': 1.3, 'words': 3.9, 'label': 0},
    {'other': 1.2, 'wordly': 1.3, 'words': 3.9, 'label': 1},
    {'other': 10, 'work': 1.3, 'like': 3.9, 'label': 1},
]
What I want to get from what I have:
dict_dict = {
    "0": {'some': 1.2, 'key': 1.3, 'words': 3.9},
    "1": {'other': 10, 'wordly': 1.3, 'work': 1.3, 'like': 3.9, 'words': 3.9},
}
Explanation:
So, I want to create a dictionary that uses the values of the "label" keys as the main keys of the new dictionary. I also need to merge dictionaries that have the same label. During this merging, I need to keep the highest value if there is a duplicate key (as with the "other" key in the example).
Why don't I do all of this before I create the original list of dicts?
Because dict_list is the result of a joblib (multiprocessing) job. Sharing objects between processes slows down the multiprocessing. So, instead of sharing, I decided to run the heavy work on multiple cores and do the organizing afterwards. I am not sure whether this approach will help, but I can't know without testing.
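The merging rule described in the question (group by label, keep the highest value for duplicate keys) could be sketched with plain dictionaries roughly as follows. This is only an illustration: `merge_by_label` is a hypothetical helper name, not something from the original post, and it assumes every dictionary has a 'label' key and that values for the same key are comparable.

```python
def merge_by_label(dict_list):
    """Group dicts by their 'label' value, keeping the max per duplicate key."""
    merged = {}
    for d in dict_list:
        label = str(d['label'])  # the question wants string keys
        target = merged.setdefault(label, {})
        for key, value in d.items():
            if key == 'label':
                continue  # the label becomes the outer key, not an inner one
            # Keep the larger value when the key was already seen for this label.
            if key not in target or value > target[key]:
                target[key] = value
    return merged


dict_list = [
    {'some': 1.2, 'key': 1.3, 'words': 3.9, 'label': 0},
    {'other': 1.2, 'wordly': 1.3, 'words': 3.9, 'label': 1},
    {'other': 10, 'work': 1.3, 'like': 3.9, 'label': 1},
]
dict_dict = merge_by_label(dict_list)
```

Unlike a version based on pop(), this sketch leaves the input dictionaries untouched.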
The Counter class from the collections module has a nice merging feature, a|b, which joins two counters, keeping the higher value for each key.
from collections import Counter

dict_dict = {}
for dictionary in dict_list:
    # pop() removes 'label' from the original dict, so dict_list is modified in place
    label = str(dictionary.pop('label'))
    # Counter union (|) keeps the higher value for each key
    dict_dict[label] = dict_dict.get(label, Counter()) | Counter(dictionary)

# If you don't need Counters, just convert back to dictionaries
dict_dict = {i: dict(v) for i, v in dict_dict.items()}
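One caveat worth checking before relying on this: the Counter union only keeps results that are strictly positive, so zero or negative values are silently dropped. The small sketch below uses made-up sample values (`a` and `b` are not from the question) to show the difference against a plain max() comparison.

```python
from collections import Counter

# Hypothetical sample data containing zero and negative values.
a = {'x': -2.0, 'y': 0, 'z': 1.5}
b = {'x': -1.0, 'z': 0.5}

# Counter union: takes the max per key, but discards results <= 0.
union = dict(Counter(a) | Counter(b))

# Plain max() per key: keeps every key regardless of sign.
missing = float('-inf')
manual = {k: max(a.get(k, missing), b.get(k, missing)) for k in a.keys() | b.keys()}
```

Here `union` ends up containing only 'z', while `manual` keeps 'x', 'y' and 'z'. For data like the example in the question, where all values are positive, both approaches give the same result.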
Easy peasy (note that this keys each dictionary by its position in the list, not by its label):
dict_of_dicts = {i:item for i,item in enumerate(list_of_dicts)}
If you insist on strings as the keys:
dict_of_dicts = {str(i):item for i,item in enumerate(list_of_dicts)}