
Transform Python dictionary into List

Using Python 3.6, what is the most efficient way to transform this dictionary into a list? I have tried several nested loops, but the result is slow and does not look efficient. This is the original dictionary:

d = {'Owner': [{'login': 'AAAA', 'mail': 'AAAAA@gmail.com'},
               {'login': 'BBBB', 'mail': 'BBBBB@gmail.com'},
               {'login': 'CCCC', 'mail': 'CCCC@gmail.com'}],
     'Stakeholder': [{'login': 'DDDD', 'mail': 'DDDD@gmail.com'},
                     {'login': 'AAAA', 'mail': 'AAAA@gmail.com'}],
     'Team': [{'login': 'CCCC', 'mail': 'CCCC@gmail.com'},
              {'login': 'BBBB', 'mail': 'BBBB@gmail.com'}]}

This is the goal:

[{'login': 'AAAA', 'mail': 'AAAAA@gmail.com', 'roles': ['Owner', 'Stakeholder']},
 {'login': 'BBBB', 'mail': 'BBBBB@gmail.com', 'roles': ['Owner', 'Team']},
 {'login': 'CCCC', 'mail': 'CCCC@gmail.com', 'roles': ['Owner', 'Team']},
 {'login': 'DDDD', 'mail': 'DDDD@gmail.com', 'roles': ['Stakeholder']}]

Thanks!

Edit 1: So far I could get a list of unique users:

list_users = []

for role, users in d.items():
    for user in users:
        list_users.append(user)

unique_list_of_users = []
for i in range(len(list_users)):
    if list_users[i] not in list_users[i + 1:]:
        unique_list_of_users.append(list_users[i])

for user in unique_list_of_users:
    user["roles"] = []

Try this:

output = []
for k, v in d.items():
    for dct in v:
        for x in output:
            if x['login'] == dct['login']:
                x['roles'].append(k)
                break
        else:
            output.append({**dct, **{'roles': [k]}})
print(output)

Output:

[{'login': 'AAAA', 'mail': 'AAAAA@gmail.com', 'roles': ['Owner', 'Stakeholder']},
 {'login': 'BBBB', 'mail': 'BBBBB@gmail.com', 'roles': ['Owner', 'Team']},
 {'login': 'CCCC', 'mail': 'CCCC@gmail.com', 'roles': ['Owner', 'Team']},
 {'login': 'DDDD', 'mail': 'DDDD@gmail.com', 'roles': ['Stakeholder']}]

Note: since you are on Python 3.6, {**dct, **{'roles': [k]}} works for you. Dictionary unpacking inside a dict literal was added in Python 3.5, so on 3.4 or lower use the following instead (be aware that it modifies the original dictionaries in place):

dct.update({'roles': [k]})
output.append(dct)
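
If modifying the input is a concern, a minimal sketch of a non-mutating variant follows. It keeps the same loop as above but builds each entry with dict(user, roles=[role]), which copies the dictionary and adds the key without needing the ** syntax. The function name merge_roles and the two-entry sample are hypothetical, chosen only for illustration.

```python
def merge_roles(d):
    """Hypothetical helper: same loop as above, but builds new dicts."""
    output = []
    for role, users in d.items():
        for user in users:
            for merged in output:
                if merged['login'] == user['login']:
                    merged['roles'].append(role)
                    break
            else:
                # dict(user, roles=[role]) copies user, then adds 'roles';
                # the original user dict is left untouched.
                output.append(dict(user, roles=[role]))
    return output


d = {'Owner': [{'login': 'AAAA', 'mail': 'AAAAA@gmail.com'}],
     'Team': [{'login': 'AAAA', 'mail': 'AAAA@gmail.com'}]}
result = merge_roles(d)
print(result)
print(d['Owner'][0])  # still has no 'roles' key
```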

Since the login field identifies each user, you can use it as the key of an intermediate dictionary, collect the roles there in a single pass, and then take that dictionary's values.

d = {
    "Owner": [
        {"login": "AAAA", "mail": "AAAAA@gmail.com"},
        {"login": "BBBB", "mail": "BBBBB@gmail.com"},
        {"login": "CCCC", "mail": "CCCC@gmail.com"},
    ],
    "Stakeholder": [
        {"login": "DDDD", "mail": "DDDD@gmail.com"},
        {"login": "AAAA", "mail": "AAAA@gmail.com"},
    ],
    "Team": [
        {"login": "CCCC", "mail": "CCCC@gmail.com"},
        {"login": "BBBB", "mail": "BBBB@gmail.com"},
    ],
}

tmp = {}

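for k, v in d.items():
    for dd in v:
        try:
            # Login already seen: just record the extra role.
            tmp[dd["login"]]["roles"].append(k)
        except KeyError:
            # First time this login is seen: copy its fields and start
            # the roles list (the copy leaves the input untouched).
            tmp[dd["login"]] = dict(dd, roles=[k])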

list(tmp.values())

gives

[{'login': 'AAAA', 'mail': 'AAAAA@gmail.com', 'roles': ['Owner', 'Stakeholder']},
 {'login': 'BBBB', 'mail': 'BBBBB@gmail.com', 'roles': ['Owner', 'Team']},
 {'login': 'CCCC', 'mail': 'CCCC@gmail.com', 'roles': ['Owner', 'Team']},
 {'login': 'DDDD', 'mail': 'DDDD@gmail.com', 'roles': ['Stakeholder']}]

Or use collections.defaultdict:

import collections
import pprint
 
d = {
    "Owner": [
        {"login": "AAAA", "mail": "AAAAA@gmail.com"},
        {"login": "BBBB", "mail": "BBBBB@gmail.com"},
        {"login": "CCCC", "mail": "CCCC@gmail.com"},
    ],
    "Stakeholder": [
        {"login": "DDDD", "mail": "DDDD@gmail.com"},
        {"login": "AAAA", "mail": "AAAA@gmail.com"},
    ],
    "Team": [
        {"login": "CCCC", "mail": "CCCC@gmail.com"},
        {"login": "BBBB", "mail": "BBBB@gmail.com"},
    ],
}

res = collections.defaultdict(dict)
for role, users in d.items():
    for obj in users:
        if obj['login'] not in res:
            # First time this login is seen: copy its fields.
            res[obj['login']].update(obj)
        res[obj['login']].setdefault('roles', []).append(role)

pprint.pprint(list(res.values()))

Output:

[{'login': 'AAAA', 'mail': 'AAAAA@gmail.com', 'roles': ['Owner', 'Stakeholder']},
 {'login': 'BBBB', 'mail': 'BBBBB@gmail.com', 'roles': ['Owner', 'Team']},
 {'login': 'CCCC', 'mail': 'CCCC@gmail.com', 'roles': ['Owner', 'Team']},
 {'login': 'DDDD', 'mail': 'DDDD@gmail.com', 'roles': ['Stakeholder']}]

Here is another solution using itertools:

from itertools import groupby

# Flatten d into (role, user) pairs and sort them by login; groupby only
# merges adjacent items, so the sort is required.
flattened_items = sorted(((role, user) for role, users in d.items() for user in users),
                         key=lambda x: x[1]['login'])

for login, group in groupby(flattened_items, key=lambda x: x[1]['login']):
    all_roles, all_dicts = zip(*group)
    # Iterate in reverse so the first occurrence's fields win on conflict.
    new_dict = {kk: vv for user in reversed(all_dicts) for kk, vv in user.items()}
    new_dict['roles'] = list(all_roles)
    print(new_dict)

This flattens d.items() into (role, user) pairs, sorts them by login, and then aggregates the pairs for each login using itertools.groupby.

Output:

{'login': 'AAAA', 'mail': 'AAAAA@gmail.com', 'roles': ['Owner', 'Stakeholder']}
{'login': 'BBBB', 'mail': 'BBBBB@gmail.com', 'roles': ['Owner', 'Team']}
{'login': 'CCCC', 'mail': 'CCCC@gmail.com', 'roles': ['Owner', 'Team']}
{'login': 'DDDD', 'mail': 'DDDD@gmail.com', 'roles': ['Stakeholder']}
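
The sort step matters because itertools.groupby only merges consecutive items with equal keys. A small sketch of the difference, using a simplified hypothetical list of (role, login) pairs rather than the full dictionaries:

```python
from itertools import groupby

# Simplified, hypothetical data: (role, login) pairs in dictionary order.
pairs = [('Owner', 'AAAA'), ('Owner', 'BBBB'), ('Stakeholder', 'AAAA')]


def by_login(pair):
    return pair[1]


# Without sorting, AAAA appears in two separate groups.
unsorted_groups = [(login, [role for role, _ in grp])
                   for login, grp in groupby(pairs, key=by_login)]

# After sorting by login, each login forms exactly one group.
sorted_groups = [(login, [role for role, _ in grp])
                 for login, grp in groupby(sorted(pairs, key=by_login), key=by_login)]

print(unsorted_groups)
# [('AAAA', ['Owner']), ('BBBB', ['Owner']), ('AAAA', ['Stakeholder'])]
print(sorted_groups)
# [('AAAA', ['Owner', 'Stakeholder']), ('BBBB', ['Owner'])]
```

Because sorted is stable, the roles within each group keep the order in which they appear in the original dictionary.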

I believe the best approach is to make the dicts in each role's list hashable. They can then serve as the keys of another dict that maps each login+mail dict to its set of roles. The Python code would be:

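class Hashabledict(dict):
    def __hash__(self):
        # Hash keys and values together; hashing only the keys would give
        # every user dict the same hash and turn lookups into linear scans.
        return hash(frozenset(self.items()))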


def solve(d):
    roles = dict()

    for key, value in d.items():
        for cur_dict in value:
            hashable_cur_dict = Hashabledict(cur_dict)

            if hashable_cur_dict not in roles:
                roles[hashable_cur_dict] = {key}
            else:
                roles[hashable_cur_dict].add(key)

    answer = list()

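    for key, value in roles.items():
        # Build a new dict instead of mutating a dict key, and sort the
        # role set so the output order is deterministic.
        answer.append({**key, 'roles': sorted(value)})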

    return answer

An example of use is below:

if __name__ == "__main__":
    d = {'Owner': [{'login': 'AAAA', 'mail': 'AAAA@gmail.com'},
                   {'login': 'BBBB', 'mail': 'BBBB@gmail.com'},
                   {'login': 'CCCC', 'mail': 'CCCC@gmail.com'}],
         'Stakeholder': [{'login': 'DDDD', 'mail': 'DDDD@gmail.com'},
                         {'login': 'AAAA', 'mail': 'AAAA@gmail.com'}],
         'Team': [{'login': 'CCCC', 'mail': 'CCCC@gmail.com'},
                  {'login': 'BBBB', 'mail': 'BBBB@gmail.com'}]}

    test = solve(d)
    print(test)

The output is:

[{'login': 'AAAA', 'mail': 'AAAA@gmail.com', 'roles': ['Owner', 'Stakeholder']}, {'login': 'BBBB', 'mail': 'BBBB@gmail.com', 'roles': ['Owner', 'Team']}, {'login': 'CCCC', 'mail': 'CCCC@gmail.com', 'roles': ['Owner', 'Team']}, {'login': 'DDDD', 'mail': 'DDDD@gmail.com', 'roles': ['Stakeholder']}]
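
One thing to keep in mind when keying on the whole record: two entries count as the same user only if every field matches. The sample data in the question lists AAAA with two different mail values (AAAAA@gmail.com under Owner and AAAA@gmail.com under Stakeholder), so such entries would not be merged. A minimal sketch of that effect, using a plain tuple of sorted items as the key in place of a dict subclass; the function name group_by_whole_record is hypothetical:

```python
def group_by_whole_record(d):
    """Hypothetical helper: group roles by the full (login, mail) record."""
    roles = {}
    for role, users in d.items():
        for user in users:
            # A tuple of sorted items is hashable and compares by content.
            key = tuple(sorted(user.items()))
            roles.setdefault(key, []).append(role)
    # Rebuild a dict from each key and attach its roles.
    return [dict(key, roles=role_list) for key, role_list in roles.items()]


d = {'Owner': [{'login': 'AAAA', 'mail': 'AAAAA@gmail.com'}],
     'Stakeholder': [{'login': 'AAAA', 'mail': 'AAAA@gmail.com'}]}
result = group_by_whole_record(d)
print(len(result))  # 2: the differing mail values keep the entries apart
```

If login alone should identify a user, keying on user['login'] as in the earlier answers avoids this.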
