How to remove duplicate values from only one element of a dictionary at a time?

Given this defaultdict(dict) data:

{726: {'X': [3.5, 3.5, 2.0], 'Y': [2.0, 0.0, 0.0], 'chr': [2, 2, 2]}, 128: {'X': [0.5, 4.0, 4.0], 'Y': [4.0, 3.5, 3.5], 'chr': [3, 3, 3]}}

The numeric values 726 and 128 are the top-level keys and are unique. The other elements are the values, each tagged with an identifier (X, Y, chr) that is also unique.

I want to remove the duplicates only from the list values in chr without affecting the data or order of the values in any other parts of the dictionary.

How may I accomplish that?

Thanks,

You can use a nested dict comprehension and convert the list to a set to get the unique items. Since all the items in chr's value are the same, the set will contain a single item, so order doesn't matter in this case. Otherwise, you can use OrderedDict.fromkeys() to get the unique items while preserving their order.

In [4]: {k: {k2: set(v2) if k2=='chr' else v2 for k2, v2 in v.items()} for k, v in d.items()}
Out[4]: 
{128: {'Y': [4.0, 3.5, 3.5], 'X': [0.5, 4.0, 4.0], 'chr': {3}},
 726: {'Y': [2.0, 0.0, 0.0], 'X': [3.5, 3.5, 2.0], 'chr': {2}}}
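For the order-preserving case mentioned above, a minimal sketch of the same comprehension with OrderedDict.fromkeys() might look like this. The sample chr list with mixed values is borrowed for illustration and the name deduped is hypothetical; the original d is left untouched.

```python
from collections import OrderedDict

# Sample data; the mixed 'chr' values are only there to make ordering visible.
d = {
    726: {'X': [3.5, 3.5, 2.0], 'Y': [2.0, 0.0, 0.0], 'chr': [2, 3, 2, 1, 1, 2, 3]},
    128: {'X': [0.5, 4.0, 4.0], 'Y': [4.0, 3.5, 3.5], 'chr': [3, 3, 3]},
}

# Build a new dict; only the 'chr' lists are deduplicated.
# OrderedDict.fromkeys keeps the first occurrence of each value, in order.
deduped = {
    k: {k2: list(OrderedDict.fromkeys(v2)) if k2 == 'chr' else v2
        for k2, v2 in v.items()}
    for k, v in d.items()
}

print(deduped[726]['chr'])  # [2, 3, 1]
print(deduped[128]['chr'])  # [3]
```

Unlike the set version, the result stays a list, so the type of the chr entry does not change.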

If d is your dictionary, you can simply do:

for k in d: d[k]['chr']=d[k]['chr'][0]

assuming chr holds a single repeated value. Note that this stores the bare value rather than a one-element list.

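If chr can hold several different values,

for k in d:
    # the None sentinel lets the last real item be compared with a "next" one
    l=d[k]['chr']+[None]
    d[k]['chr']=[x for (i,x) in enumerate(l[:-1]) if l[i]!=l[i+1]]

will do the job by collapsing each run of adjacent duplicates. Repeats that are not adjacent are kept.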
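The same adjacent-duplicate collapse can be sketched with itertools.groupby from the standard library. This is an alternative way to write it, not the answerer's code, and collapse_runs is a hypothetical helper name.

```python
from itertools import groupby

def collapse_runs(values):
    """Keep one item from each run of equal adjacent values."""
    # groupby yields one (key, group) pair per run of consecutive equal items.
    return [key for key, _group in groupby(values)]

print(collapse_runs([2, 2, 3, 3, 2]))  # [2, 3, 2]
print(collapse_runs([3, 3, 3]))        # [3]
```

Applied to the dictionary, this would be d[k]['chr'] = collapse_runs(d[k]['chr']) inside the loop over d.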

What you should do is iterate over the values of the outer dictionary and, for each one, take the 'chr' entry and convert its list to a set (which can only hold unique values), then back to a list. Note that a set does not preserve the original order of the items.

for lists in YOUR_DICT.values():
    lists['chr'] = list(set(lists['chr']))
print(YOUR_DICT)
# {'726': {'Y': [2.0, 0.0, 0.0], 'X': [3.5, 3.5, 2.0], 'chr': [2]}, 
#  '128': {'Y': [4.0, 3.5, 3.5], 'X': [0.5, 4.0, 4.0], 'chr': [3]}}

This will preserve the order of the lists:

import copy
from collections import OrderedDict
a={726: {'X': [3.5, 3.5, 2.0], 'Y': [2.0, 0.0, 0.0], 'chr': [2, 3, 2, 1, 1, 2, 3]}, 128: {'X': [0.5, 4.0, 4.0], 'Y': [4.0, 3.5, 3.5], 'chr': [3, 3, 3]}}
b=copy.deepcopy(a)  # snapshot to read from while a is updated
for key in b:
    # OrderedDict.fromkeys keeps the first occurrence of each value, in order
    a[key]['chr']=list(OrderedDict.fromkeys(b[key]['chr']))

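On Python versions before 3.7, the order of the top-level keys in a is not guaranteed and may be lost the moment a is created. If you want a to have 726 first on those versions, you need to create it as an OrderedDict from the beginning. From Python 3.7 onward, plain dicts preserve insertion order.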
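On Python 3.7 or newer, where plain dicts keep insertion order, the same idea can be sketched without OrderedDict or the deep copy. This is a minimal sketch under that version assumption; dedupe_field is a hypothetical helper name.

```python
def dedupe_field(data, field='chr'):
    """Deduplicate data[key][field] in place, keeping first occurrences."""
    for inner in data.values():
        # dict.fromkeys preserves insertion order on Python 3.7+.
        inner[field] = list(dict.fromkeys(inner[field]))
    return data

a = {
    726: {'X': [3.5, 3.5, 2.0], 'Y': [2.0, 0.0, 0.0], 'chr': [2, 3, 2, 1, 1, 2, 3]},
    128: {'X': [0.5, 4.0, 4.0], 'Y': [4.0, 3.5, 3.5], 'chr': [3, 3, 3]},
}
dedupe_field(a)

print(list(a))        # [726, 128]
print(a[726]['chr'])  # [2, 3, 1]
```

Only the chosen field is reassigned, so the X and Y lists and the top-level key order are left as they were.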
