
How to use map/reduce on Python Ordered dictionary

I have a nested Python dictionary like this:

my_dictionary = {"Ab" : {'name': 'usa', 'boolean': 'YES'},
"Ac" : {'name': 'usa', 'boolean': 'NO'},
"Ad": {'name': 'UK', 'boolean': 'NO'},
"Ae": {'name': 'UK', 'boolean': 'NO'}}

I created an ordered dictionary from the above dictionary, sorted by 'name', like this:

from collections import OrderedDict
sorted_dict = OrderedDict(sorted(my_dictionary.items(), key=lambda x: x[1]['name']))
print(sorted_dict)

This gives:

OrderedDict([("Ab", {'name': 'usa', 'boolean': 'YES'}),
("Ac", {'name': 'usa', 'boolean': 'NO'}),
("Ad", {'name': 'UK', 'boolean': 'NO'}),
("Ae", {'name': 'UK', 'boolean': 'NO'})])

I need to add a new column ('result') to the ordered dictionary. The logic for creating the new column is as follows:

Collect all rows that share the same 'name' (here 'usa' and 'UK'), then reduce their 'boolean' values with a logical OR: 'result' should be 'YES' if any row in the group has 'boolean' equal to 'YES', and 'NO' otherwise.

I tried to apply reduce like this :

reduce(lambda x,y: x['boolean'] or y['boolean']

but got stuck on selecting all the rows with the same 'name'.

The final ordered dictionary should look like this:

OrderedDict([("Ab", {'name': 'usa', 'boolean': 'YES', 'result': 'YES'}),
("Ac", {'name': 'usa', 'boolean': 'NO', 'result': 'YES'}),
("Ad", {'name': 'UK', 'boolean': 'NO', 'result': 'NO'}),
("Ae", {'name': 'UK', 'boolean': 'NO', 'result': 'NO'})])

Let me help you a little bit:

  1. The ordered dictionary doesn't matter much here. You can leave it out and introduce it once the logic is done.
  2. I would convert 'YES' to True and 'NO' to False at the very beginning. Make life easy, not complicated.
  3. You can do without lambda and reduce. Python has list comprehensions and the built-in any() function; any() returns True if at least one element of an iterable is true, which is the Boolean OR applied to a list of Boolean values.
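
The three points above could be sketched roughly as follows. This is only an illustration written for Python 3; the names `rows`, `result_by_name` and `sorted_rows` are made up for the example and do not come from the question.

```python
from collections import OrderedDict

my_dictionary = {"Ab": {'name': 'usa', 'boolean': 'YES'},
                 "Ac": {'name': 'usa', 'boolean': 'NO'},
                 "Ad": {'name': 'UK', 'boolean': 'NO'},
                 "Ae": {'name': 'UK', 'boolean': 'NO'}}

# Point 2: turn 'YES'/'NO' into real booleans up front.
rows = {k: {'name': v['name'], 'boolean': v['boolean'] == 'YES'}
        for k, v in my_dictionary.items()}

# Point 3: any() ORs together the flags of all rows sharing a name.
names = {v['name'] for v in rows.values()}
result_by_name = {
    n: any(v['boolean'] for v in rows.values() if v['name'] == n)
    for n in names
}

# Attach the per-name result to every row.
for v in rows.values():
    v['result'] = result_by_name[v['name']]

# Point 1: bring in the ordering only at the end.
sorted_rows = OrderedDict(sorted(rows.items(), key=lambda kv: kv[1]['name']))
```

The comprehension rescans all rows once per distinct name, which is fine for small data; for large inputs a single pass that accumulates per name would be the cheaper choice.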

I am not sure I understood the question correctly, but I hope this is what you are looking for.

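from functools import reduce
from itertools import groupby

def reduceByKey(func, iterable):
    # sort by key so that groupby sees each key as one contiguous run,
    # then fold the values of every run with func
    return map(
        lambda l: (l[0], reduce(func, map(lambda p: p[1], l[1]))),
        groupby(sorted(iterable, key=lambda p: p[0]), lambda p: p[0]))

# Are you sure you want to do ("YES" or "NO") and not (True or False)?
# 'NO' is a non-empty string and therefore truthy, so compare with "YES" first
pairs = map(lambda d: (yourDict[d]["name"], yourDict[d]["boolean"] == "YES"), yourDict)
result = dict(reduceByKey(lambda x, y: x or y, pairs))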

yourDict here is your original dictionary
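
To get all the way to the output asked for in the question, the per-name value still has to be written back into every row. A minimal sketch of that, assuming Python 3 and the 'YES'/'NO' strings from the question, is shown here; the function name `add_result_with_reduce` is hypothetical. It also shows why reducing the raw strings with `or` does not work: 'NO' is a non-empty string, so it counts as true.

```python
from functools import reduce
from itertools import groupby
from operator import or_

d = {"Ab": {'name': 'usa', 'boolean': 'YES'},
     "Ac": {'name': 'usa', 'boolean': 'NO'},
     "Ad": {'name': 'UK', 'boolean': 'NO'},
     "Ae": {'name': 'UK', 'boolean': 'NO'}}

# Pitfall: `or` on the raw strings returns the first truthy operand,
# and 'NO' is truthy, so the 'YES' is never reached.
naive = reduce(lambda x, y: x or y, ['NO', 'YES'])

def name_of(item):
    # item is a (key, row) pair; group rows by their 'name' field
    return item[1]['name']

def add_result_with_reduce(rows):
    # groupby only merges adjacent items, so sort by name first
    ordered = sorted(rows.items(), key=name_of)
    for _, group in groupby(ordered, key=name_of):
        group = list(group)  # materialise: the group is read twice
        flag = reduce(or_, (row['boolean'] == 'YES' for _, row in group), False)
        for _, row in group:
            row['result'] = 'YES' if flag else 'NO'
    return rows

add_result_with_reduce(d)
```

The rows are modified in place, so wrapping `d` in an OrderedDict before or after the call gives the same inner dictionaries with the extra 'result' key.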

Here's a method that seems to work with the data you provided, but I am not sure what it has to do with reduce.

from collections import OrderedDict, defaultdict

d = OrderedDict([("Ab", {'name': 'usa', 'boolean': 'YES'}),
                 ("Ac", {'name': 'usa', 'boolean': 'NO'}),
                 ("Ad", {'name': 'UK', 'boolean': 'NO'}),
                 ("Ae", {'name': 'UK', 'boolean': 'NO'})])

def add_result(d, ikey='name', check='boolean', tt='YES', ff='NO'):
    # hold results per ikey
    ikey_results = defaultdict(lambda: ff)
    # first pass to get results
    for v in d.values():
        if v[check] == tt:
            ikey_results[v[ikey]] = tt
    # second pass to embed results
    for v in d.values():
        v['result'] = ikey_results[v[ikey]]
    return d

print(add_result(d))


OrderedDict([('Ab', {'boolean': 'YES', 'name': 'usa', 'result': 'YES'}),
             ('Ac', {'boolean': 'NO', 'name': 'usa', 'result': 'YES'}), 
             ('Ad', {'boolean': 'NO', 'name': 'UK', 'result': 'NO'}), 
             ('Ae', {'boolean': 'NO', 'name': 'UK', 'result': 'NO'})])


from pprint import pprint

my_dictionary = {"Ab": {'name': 'usa', 'boolean': True},
                 "Ac": {'name': 'usa', 'boolean': False},
                 "Ad": {'name': 'UK', 'boolean': False},
                 "Ae": {'name': 'UK', 'boolean': False}}

sub_result = dict()

for x in my_dictionary.values():
    country_name = x['name']
    # OR the flags of all rows that share the same name
    sub_result[country_name] = sub_result.get(country_name, False) or x['boolean']

# copy each row and attach the result computed for its name
new_dictionary = {k: dict(v, result=sub_result[v['name']]) for k, v in my_dictionary.items()}

pprint(new_dictionary)


No ordered dictionary needed.

