
How to rearrange an OrderedDict based on part of the key, using a list

I am rearranging an OrderedDict based on part of each key, using a list. For example:

old_OD = OrderedDict([('cat_1',1), 
            ('dog_1',2), 
            ('cat_2',3),
            ('fish_1',4), 
            ('dog_2',5)])

Now I have a list giving the order of the groups:

order = ['dog', 'cat', 'fish']

and I want a result in which the dictionary's items are grouped together, like this:

new_OD = OrderedDict([('dog_1',2),
            ('dog_2',5), 
            ('cat_1',1), 
            ('cat_2',3),
            ('fish_1',4)])

I found two excellent related questions, "How to reorder OD based on list" and "Re-ordering OrderedDict", and I am going with the solution from the second one:

new_od = OrderedDict([(k, None) for k in order if k in old_od])
new_od.update(old_od)

In my case, however, "k" is not an exact match for the desired key in new_od. How should I modify this to construct the new OrderedDict?

EDIT: What happens if no underscore marks the position of the keyword, as in "Big_cat_3" or "dog_black_2"? The keyword could be anywhere in the string. Once the keys are grouped together, alphanumeric order is not needed.

Here are two variants of a solution.

1. For keys with the same prefix, keep the order of the initial OrderedDict

Here I use a list comprehension to iterate over the order list and the OrderedDict. Based on the prefix comparison, it produces a list of tuples in the desired order, which is passed to the OrderedDict constructor:

>>> from collections import OrderedDict
>>> old_OD = OrderedDict([('cat_1',1),
...             ('dog_1',2),
...             ('cat_2',3),
...             ('fish_1',4),
...             ('dog_2',5)])
>>> order = ['dog', 'cat', 'fish']

>>> # k.startswith(o + '_') matches keys whose prefix is <keyword> + "_"
>>> new_OD = OrderedDict([(k,v) for o in order for k, v in old_OD.items() if k.startswith(o+'_')])

where new_OD will hold:

OrderedDict([('dog_1', 2), ('dog_2', 5), ('cat_1', 1), ('cat_2', 3), ('fish_1', 4)])
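
The prefix test above does not cover the case raised in the question's edit, where the keyword can appear anywhere in the key. A minimal sketch of one way to adapt it, assuming the keyword always appears as a whole underscore-delimited token (the keys come from the question's edit; the values are made up for illustration):

```python
from collections import OrderedDict

# Keys from the question's edit; the values are made up.
old_OD = OrderedDict([('Big_cat_3', 1),
                      ('dog_black_2', 2),
                      ('cat_1', 3),
                      ('fish_1', 4),
                      ('dog_1', 5)])
order = ['dog', 'cat', 'fish']

# For each keyword, keep the pairs whose key contains it as a whole token.
# Within a group, the original insertion order is preserved.
new_OD = OrderedDict((k, v)
                     for o in order
                     for k, v in old_OD.items()
                     if o in k.split('_'))

print(list(new_OD))
# ['dog_black_2', 'dog_1', 'Big_cat_3', 'cat_1', 'fish_1']
```

A key that contains none of the keywords is silently dropped by this sketch, and a key containing two keywords would end up in the later group only, since the second assignment overwrites nothing but the first position is kept by the dict.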

2. For keys with the same prefix, sort the elements lexicographically

We can modify the above solution using sorted and itertools.chain with a nested list comprehension:

>>> from itertools import chain

>>> new_OD = OrderedDict(chain(*[sorted([(k,v) for k, v in old_OD.items() if k.startswith(o+'_')]) for o in order]))

where new_OD will hold:

OrderedDict([('dog_1', 2), ('dog_2', 5), ('cat_1', 1), ('cat_2', 3), ('fish_1', 4)])
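
With the question's sample data both variants give the same result, so the difference is easy to miss. The following sketch uses a hypothetical input in which 'dog_2' was inserted before 'dog_1' to show where they diverge:

```python
from collections import OrderedDict
from itertools import chain

# Hypothetical input in which 'dog_2' comes before 'dog_1'.
old_OD = OrderedDict([('dog_2', 5), ('cat_1', 1), ('dog_1', 2)])
order = ['dog', 'cat']

# Variant 1: the original insertion order is kept within each group.
keep_order = OrderedDict((k, v)
                         for o in order
                         for k, v in old_OD.items()
                         if k.startswith(o + '_'))

# Variant 2: each group is sorted lexicographically by key.
sorted_groups = OrderedDict(chain.from_iterable(
    sorted((k, v) for k, v in old_OD.items() if k.startswith(o + '_'))
    for o in order))

print(list(keep_order))     # ['dog_2', 'dog_1', 'cat_1']
print(list(sorted_groups))  # ['dog_1', 'dog_2', 'cat_1']
```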

You can build a dict that maps each item in order to its index, and then sort the items of old_OD with a key function that finds the part of each key that appears in the mapping dict and returns the corresponding index:

keys = {k: i for i, k in enumerate(order)}
OrderedDict(sorted(old_OD.items(), key=lambda t: keys.get(next(i for i in t[0].split('_') if i in keys))))

This returns:

OrderedDict([('dog_1', 2), ('dog_2', 5), ('cat_1', 1), ('cat_2', 3), ('fish_1', 4)])
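
Because sorted is stable, items within the same group keep their original relative order. One caveat is that next() without a default raises StopIteration when a key contains none of the keywords. A minimal sketch of a more forgiving key function follows; the rank helper and the 'bird_1' entry are hypothetical, and placing unmatched keys last is just one possible policy:

```python
from collections import OrderedDict

# 'bird_1' is a made-up entry that matches none of the keywords.
old_OD = OrderedDict([('cat_1', 1), ('dog_1', 2), ('bird_1', 9),
                      ('fish_1', 4), ('dog_2', 5)])
order = ['dog', 'cat', 'fish']
keys = {k: i for i, k in enumerate(order)}

def rank(item):
    # Index of the first token of the key found in `keys`;
    # keys with no matching token get len(order) and therefore sort last.
    return next((keys[t] for t in item[0].split('_') if t in keys), len(order))

new_OD = OrderedDict(sorted(old_OD.items(), key=rank))

print(list(new_OD))
# ['dog_1', 'dog_2', 'cat_1', 'fish_1', 'bird_1']
```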

You can use the function groupby() with a sorted dictionary:

from collections import OrderedDict
from itertools import groupby, chain
from operator import itemgetter

old_OD = OrderedDict([('cat_1',1), 
    ('dog_1',2), 
    ('cat_2',3),
    ('fish_1',4), 
    ('dog_2',5)])

order = ['dog', 'cat', 'fish']

gb = groupby(sorted(old_OD.items()), key=lambda t: t[0].split('_')[0])
gb = {k: list(g) for k, g in gb}
OrderedDict(chain.from_iterable(itemgetter(*order)(gb)))
# OrderedDict([('dog_1', 2), ('dog_2', 5), ('cat_1', 1), ('cat_2', 3), ('fish_1', 4)])
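
Two details of the groupby() version are worth noting: itemgetter(*order) raises KeyError when a keyword has no matching keys, and it returns a single list rather than a tuple of lists when order has only one element. A hedged sketch that sidesteps both by looking up each group with dict.get (the prefix helper and the 'bird' keyword are hypothetical, added for illustration):

```python
from collections import OrderedDict
from itertools import groupby, chain

old_OD = OrderedDict([('cat_1', 1), ('dog_1', 2), ('cat_2', 3),
                      ('fish_1', 4), ('dog_2', 5)])
order = ['dog', 'bird', 'cat']  # 'bird' is a made-up keyword with no keys

def prefix(item):
    # Part of the key before the first underscore.
    return item[0].split('_')[0]

# groupby only merges adjacent items, so the input is sorted first.
groups = {k: list(g) for k, g in groupby(sorted(old_OD.items()), key=prefix)}

# Missing groups contribute nothing instead of raising KeyError.
new_OD = OrderedDict(chain.from_iterable(groups.get(o, []) for o in order))

print(list(new_OD.items()))
# [('dog_1', 2), ('dog_2', 5), ('cat_1', 1), ('cat_2', 3)]
```

Groups whose prefix is not listed in order ('fish' here) are left out of the result.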

A more efficient approach, which runs in O(n) time instead of the O(n log n) of sorting, is to group the items in a single pass. Convert order to a set for fast lookups, then build a dict that maps the keyword found in each key to a list of the key-value pairs from old_OD that belong to it. Finally, build the new OrderedDict by walking through order and emitting the pairs stored under each keyword:

keys = set(order)
mapping = {}
for k, v in old_OD.items():
    # group each pair under the first token of its key that is a known keyword
    mapping.setdefault(next(i for i in k.split('_') if i in keys), []).append((k, v))
OrderedDict(t for o in order for t in mapping[o])

This returns:

OrderedDict([('dog_1', 2), ('dog_2', 5), ('cat_1', 1), ('cat_2', 3), ('fish_1', 4)])

Here is another approach using regex and partial functions.

import re
from operator import itemgetter
from functools import partial

first = itemgetter(0)
pattern = '|'.join(order) # 'dog|cat|fish'

def group(order, pattern, txt):
    item = first(txt)
    res = re.search(pattern, item)
    return order.index(res.group(0))

p = partial(group, order, pattern)

OrderedDict(sorted(old_OD.items(), key=p))

OrderedDict([('dog_1', 2),
             ('dog_2', 5),
             ('cat_1', 1),
             ('cat_2', 3),
             ('fish_1', 4)])
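
Since re.search looks for the keyword anywhere in the key, the regex approach also handles keys such as "Big_cat_3" from the question's edit. A hedged variant is sketched below: it escapes the keywords with re.escape so that characters like '+' are not read as regex syntax, and it places keys with no match last instead of failing on res.group(0). The 'c++' keyword, the group_index helper, and the values are made up for illustration:

```python
import re
from collections import OrderedDict

# Keys partly from the question's edit; values and 'c++_1' are made up.
old_OD = OrderedDict([('Big_cat_3', 1), ('dog_black_2', 2),
                      ('fish_1', 4), ('c++_1', 7)])
order = ['dog', 'c++', 'cat', 'fish']  # 'c++' is a hypothetical keyword

# re.escape keeps characters such as '+' from being treated as regex operators.
pattern = re.compile('|'.join(map(re.escape, order)))
rank = {word: i for i, word in enumerate(order)}

def group_index(item):
    match = pattern.search(item[0])
    # Keys that contain no keyword are placed after all known groups.
    return rank[match.group(0)] if match else len(order)

new_OD = OrderedDict(sorted(old_OD.items(), key=group_index))

print(list(new_OD))
# ['dog_black_2', 'c++_1', 'Big_cat_3', 'fish_1']
```

Looking the match up in a dict instead of calling order.index avoids a linear scan of order for every item.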
