
Removing common elements from a dictionary of lists in python

I have a dictionary of lists and the lists contain dictionaries like so:

my_dict = {
'list1': [{'catch': 100, 'id': '1'}, {'catch': 101, 'id': '2'}, 
          {'catch': 50, 'id': '1'}], 
'list2': [{'catch': 189, 'id': '1'}, {'catch': 120, 'id': '12'}], 
'list3': [{'catch': 140, 'id': '1'}, {'catch': 10, 'id': '100'}]
}

What is the most Pythonic way of removing the list items with common 'id' values and storing them in a separate list? The output would be something like this:

my_dict = {
'list1': [{'catch': 101, 'id': '2'}], 
'list2': [{'catch': 120, 'id': '12'}], 
'list3': [{'catch': 10, 'id': '100'}],
'list4': [{'catch': 100, 'id': '1'}, {'catch': 50, 'id': '1'}, 
          {'catch': 189, 'id': '1'}, {'catch': 140, 'id': '1'}]
}

In my program I have 7 lists similar to this, and if an 'id' appears in two or more of these lists, I want to store every item with that 'id' in an 8th list for further processing.

with regards, finnurtorfa

Consider restructuring your data into something like this:

>>> import itertools
>>> { k: [d['catch'] for d in v] for k, v in itertools.groupby(sorted(itertools.chain(*my_dict.values()), key=lambda d: d['id']), lambda d: d['id']) }
{'1': [100, 50, 140, 189], '2': [101], '100': [10], '12': [120]}

You haven't described what your data represents, so this may not be appropriate for you. But the tools used (chain and groupby from itertools) should at least give you some ideas.

Edit: I used the sample answer from the question in my testing by accident. Fixed by adding sorting to the input to groupby.
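
To illustrate why the sorting matters, here is a minimal sketch with a made-up list of ids (not the data from the question). groupby only merges consecutive items that share a key, so an unsorted input can produce the same key more than once:

```python
import itertools

ids = ['1', '2', '1']  # made-up ids, deliberately unsorted

# Without sorting, '1' shows up as two separate groups.
unsorted_keys = [key for key, _ in itertools.groupby(ids)]

# After sorting, equal ids are adjacent and form a single group.
sorted_keys = [key for key, _ in itertools.groupby(sorted(ids))]

print(unsorted_keys)  # ['1', '2', '1']
print(sorted_keys)    # ['1', '2']
```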

>>> import itertools, operator
>>> get_id = operator.itemgetter("id")
>>> flattened_dict = itertools.chain.from_iterable(my_dict.values())
>>> groups = itertools.groupby(sorted(flattened_dict, key=get_id), get_id)
>>> {k: list(v) for k, v in groups}
{'1': [{'catch': 100, 'id': '1'},
  {'catch': 50, 'id': '1'},
  {'catch': 140, 'id': '1'},
  {'catch': 189, 'id': '1'}],
 '100': [{'catch': 10, 'id': '100'}],
 '12': [{'catch': 120, 'id': '12'}],
 '2': [{'catch': 101, 'id': '2'}]}

Explanation:

  • get_id is a function that takes an object x and returns x["id"].
  • flattened_dict is an iterable over every item in every list (i.e. the concatenation of all the .values() of my_dict).
  • We then sort flattened_dict with the key function get_id (that is, sort by id) and group the result by id.

This works because itertools.groupby groups consecutive items that share a key, so sorting by id first puts all items with the same id into a single group.
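
Grouping by id is only half of what the question asks for. One possible way to finish the job, assuming "common" means an id that appears in two or more of the lists and assuming Python 3.7+ (where dicts keep insertion order), is sketched below. The names list_counts, shared_ids and result are made up for this example; 'list4' is the key used in the question's expected output.

```python
from collections import Counter

my_dict = {
    'list1': [{'catch': 100, 'id': '1'}, {'catch': 101, 'id': '2'},
              {'catch': 50, 'id': '1'}],
    'list2': [{'catch': 189, 'id': '1'}, {'catch': 120, 'id': '12'}],
    'list3': [{'catch': 140, 'id': '1'}, {'catch': 10, 'id': '100'}],
}

# For each id, count how many lists contain it. The inner set makes sure
# a list counts only once even if the id occurs in it several times.
list_counts = Counter(
    item_id
    for items in my_dict.values()
    for item_id in {d['id'] for d in items}
)
shared_ids = {item_id for item_id, n in list_counts.items() if n >= 2}

# Keep the non-shared items where they are ...
result = {
    name: [d for d in items if d['id'] not in shared_ids]
    for name, items in my_dict.items()
}
# ... and collect every item with a shared id in a separate list.
result['list4'] = [
    d for items in my_dict.values() for d in items if d['id'] in shared_ids
]

print(result)
```

This builds a new dictionary instead of modifying my_dict in place, which avoids deleting from lists while iterating over them.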

Something along the following lines:

my_dict = {
'list1': [{'catch': 100, 'id': '1'}, {'catch': 101, 'id': '2'}, 
      {'catch': 50, 'id': '1'}], 
'list2': [{'catch': 189, 'id': '1'}, {'catch': 120, 'id': '12'}], 
'list3': [{'catch': 140, 'id': '1'}, {'catch': 10, 'id': '100'}]
}

from itertools import groupby

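sub = {}
for k in my_dict:
    # groupby only merges consecutive items, so the same id can come up
    # more than once per list; accumulating into sub handles that.
    for kk, g in groupby(my_dict[k], lambda v: v["id"]):
        if kk not in sub:
            sub[kk] = []
        sub[kk] = sub[kk] + list(g)

print(sub)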

{'1': [{'catch': 100, 'id': '1'}, {'catch': 50, 'id': '1'}, {'catch': 140, 'id': '1'}, {'catch': 189, 'id': '1'}], '12': [{'catch': 120, 'id': '12'}], '100': [{'catch': 10, 'id': '100'}], '2': [{'catch': 101, 'id': '2'}]}
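
Since the loop above accumulates items per id by hand, groupby is not strictly needed here. A possible alternative, sketched with collections.defaultdict (the name by_id is made up for this example), would be:

```python
from collections import defaultdict

my_dict = {
    'list1': [{'catch': 100, 'id': '1'}, {'catch': 101, 'id': '2'},
              {'catch': 50, 'id': '1'}],
    'list2': [{'catch': 189, 'id': '1'}, {'catch': 120, 'id': '12'}],
    'list3': [{'catch': 140, 'id': '1'}, {'catch': 10, 'id': '100'}],
}

# defaultdict(list) creates an empty list the first time an id is seen,
# so no explicit "is the key already there?" check is needed.
by_id = defaultdict(list)
for items in my_dict.values():
    for d in items:
        by_id[d['id']].append(d)

print(dict(by_id))
```

The order of the items inside each group follows the iteration order of my_dict, so it can differ between Python versions.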

