
Python - itertools.groupby 2

I'm having trouble with itertools.groupby. Given a list of strings,

my_list = ["AD01", "AD01AA", "AD01AB", "AD01AC", "AD01AD", "AD02", "AD02AA", "AD02AB", "AD02AC"]

From this list, I want to build a list of dictionaries, where "Legacy" holds the short name and "rphy" holds the longer names that follow it.

Example:

[
{"Legacy" : "AD01", "rphy" : ["AD01AA", "AD01AB", "AD01AC", "AD01AD"]},
{"Legacy" : "AD02", "rphy" : ["AD02AA", "AD02AB", "AD02AC"]},
]

Could you help me, please?

You can use itertools.groupby, with a few calls to next:

from itertools import groupby

my_list = ["AD01", "AD01AA", "AD01AB", "AD01AC", "AD01AD", "AD02", "AD02AA", "AD02AB", "AD02AC"]

groups = groupby(my_list, len)
output = [{'Legacy': next(g), 'rphy': list(next(groups)[1])} for _, g in groups]

print(output)
# [{'Legacy': 'AD01', 'rphy': ['AD01AA', 'AD01AB', 'AD01AC', 'AD01AD']},
#  {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB', 'AD02AC']}]

This is not robust to reordering of the input list: each short name must be immediately followed by its longer names.
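To see why the order matters, it may help to look at the runs that groupby produces. The snippet below is only an illustrative sketch; the shortened list and the `shuffled` variant are made up for the demonstration.

```python
from itertools import groupby

my_list = ["AD01", "AD01AA", "AD01AB", "AD02", "AD02AA"]

# groupby only merges *consecutive* items that share a key (here, the length),
# so the input order decides where each run starts and ends.
runs = [(length, list(group)) for length, group in groupby(my_list, len)]
print(runs)
# [(4, ['AD01']), (6, ['AD01AA', 'AD01AB']), (4, ['AD02']), (6, ['AD02AA'])]

# Hypothetical reordering of the same items: the runs no longer alternate
# between one short name and its longer names.
shuffled = ["AD01AA", "AD02", "AD01", "AD02AA", "AD01AB"]
shuffled_runs = [(length, list(group)) for length, group in groupby(shuffled, len)]
print(shuffled_runs)
# [(6, ['AD01AA']), (4, ['AD02', 'AD01']), (6, ['AD02AA', 'AD01AB'])]
```

The one-liner above depends on the first pattern, where each run of length 4 holds exactly one name and is followed by a run of longer names.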

Also, if there is a "gap" in the input, e.g. if "AD01" has no corresponding 'rphy' entries, the code can raise a StopIteration error, as you have found out (this happens when the last short name has nothing after it), or it can silently attach the entries to the wrong key. In that case you can use a more conventional approach:

my_list = ["AD01", "AD02", "AD02AA", "AD02AB", "AD02AC"]

output = []
for item in my_list:
    if len(item) == 4:
        dct = {'Legacy': item, 'rphy': []}
        output.append(dct)
    else:
        dct['rphy'].append(item)

print(output)
# [{'Legacy': 'AD01', 'rphy': []}, {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB', 'AD02AC']}]
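If the input order cannot be trusted at all, one possible alternative is to group by prefix instead of by position. This is a sketch, not part of the answer above, and it assumes that every "Legacy" name is exactly four characters long and that each longer name starts with its "Legacy" name; `PREFIX_LEN` and `grouped` are names introduced here for illustration.

```python
my_list = ["AD02AA", "AD01", "AD02", "AD01AB", "AD02AB", "AD01AA"]

PREFIX_LEN = 4  # assumption: every "Legacy" name has exactly 4 characters

grouped = {}
for item in my_list:
    # Register the prefix even if no longer name ever shows up for it.
    rphy = grouped.setdefault(item[:PREFIX_LEN], [])
    if len(item) > PREFIX_LEN:
        rphy.append(item)

# Sort keys and values so the result does not depend on the input order.
output = [{"Legacy": key, "rphy": sorted(names)} for key, names in sorted(grouped.items())]
print(output)
# [{'Legacy': 'AD01', 'rphy': ['AD01AA', 'AD01AB']},
#  {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB']}]
```

Note that this version also creates an entry for a prefix whose short name never appears in the list on its own, which may or may not be what you want.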

One approach would be the following (see the note at the end of the answer):

from itertools import groupby
from pprint import pprint

my_list = [
    "AD01",
    "AD01AA",
    "AD01AB",
    "AD01AC",
    "AD01AD",
    "AD02",
    "AD02AA",
    "AD02AB",
    "AD02AC",
]

res = []
for _, g in groupby(my_list, len):
    lst = list(g)
    if len(lst) == 1:
        res.append({"Legacy": lst[0], "rphy": []})
    else:
        res[-1]["rphy"].extend(lst)

pprint(res)

Output:

[{'Legacy': 'AD01', 'rphy': ['AD01AA', 'AD01AB', 'AD01AC', 'AD01AD']},
 {'Legacy': 'AD02', 'rphy': ['AD02AA', 'AD02AB', 'AD02AC']}]

This assumes that your data always starts with a key (a name that is shorter than the names that follow it).

In every iteration, you check the length of the list created from the current groupby group. If it is 1, the item is a key; otherwise, the items are added to the most recent dictionary.

Note: This code breaks if there are fewer than two longer names between two keys. A single longer name forms a group of length 1 and is mistaken for a key, and two adjacent keys form a group of length 2 and are treated as values.
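One way around the limitation in the note, sketched here under the assumption that every key is exactly four characters long, is to decide by the length that groupby reports rather than by the size of the group. `LEGACY_LEN` is a name introduced for this example.

```python
from itertools import groupby

LEGACY_LEN = 4  # assumption: every key has exactly 4 characters

# "AD01" has no longer names, and "AD02" has only one.
my_list = ["AD01", "AD02", "AD02AA", "AD03", "AD03AA", "AD03AB"]

res = []
for length, group in groupby(my_list, len):
    if length == LEGACY_LEN:
        # Adjacent keys land in the same run, so add one entry per name.
        res.extend({"Legacy": name, "rphy": []} for name in group)
    else:
        # A run of longer names belongs to the most recent key.
        res[-1]["rphy"].extend(group)

print(res)
# [{'Legacy': 'AD01', 'rphy': []},
#  {'Legacy': 'AD02', 'rphy': ['AD02AA']},
#  {'Legacy': 'AD03', 'rphy': ['AD03AA', 'AD03AB']}]
```

Like the original, this still assumes the list starts with a key; otherwise `res[-1]` raises an IndexError.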

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 