I am trying to iterate over the groups produced by itertools.groupby in a recursive function, in order to construct a nested dictionary from nested lists.
Input
example = [['a', [], 'b', (), 1, None],
['a', [], 'c', (), 0, None],
['a', [], 2, None, None, None],
['a', [], 3, None, None, None],
['a', [], 3, None, None, None],
]
Expected output
output = {'a': [{'b': (1, None)},
{'c': (0, None)},
2, None, None, None, 3, None, None,
None, 3, None, None, None
]
}
The code I am trying
from itertools import chain, groupby

def group_key(lst, level=0):
    return lst[level]

def build_dict(data=None, grouper=None):
    if grouper is None:
        gen = groupby(data, key=group_key)
    else:
        if any(isinstance(i, list) for i in grouper):
            level_down = [l[1:] for l in grouper]
            gen = groupby(level_down, key=group_key)
        else:
            return grouper
    for char, group in gen:
        group_lst = list(group)
        if isinstance(char, str):
            value = {char: build_dict(grouper=group_lst)}
        elif char == ():
            value = tuple(build_dict(grouper=group_lst))
        elif char == []:
            value = [build_dict(grouper=group_lst)]
        else:
            value = chain.from_iterable(group_lst)
        return value
When I run the code I get only the first group from the for char, group in gen: loop. Somehow the function does not continue with the other groups. I am not great at recursive functions, so perhaps I am missing something there. This is what the code produces:
In: build_dict(example)
Out: {'a': [{'b': (1, None)}]}
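One likely cause, assuming the return value line sits inside the for body (which the output above suggests): a return inside a loop exits the function on the first iteration, so the later groups are never reached. A minimal sketch of the difference, using hypothetical helpers first_only and all_groups on toy data rather than the original function:

```python
from itertools import groupby

rows = [["a", 1], ["a", 2], ["b", 3]]

def first_only(data):
    # return inside the loop body: the function exits after the first group
    for key, group in groupby(data, key=lambda row: row[0]):
        return {key: [row[1] for row in group]}

def all_groups(data):
    # accumulate every group, then return once after the loop has finished
    result = {}
    for key, group in groupby(data, key=lambda row: row[0]):
        result[key] = [row[1] for row in group]
    return result

print(first_only(rows))  # {'a': [1, 2]}
print(all_groups(rows))  # {'a': [1, 2], 'b': [3]}
```

The same idea would apply to build_dict: collect the value of each group and return only after the loop ends.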
The structure is a bit inconsistent: it presents dictionary content as a list of [key, collection, values...] rows at the top level, but specifies sub-dictionaries inline, without the enclosing list of lists. Despite having to work around this inconsistency, the data structure can be built recursively.
def buildData(content,asValues=False):
    if not asValues:
        result = dict() # assumes a list of key, model, values...
        for k,model,*values in content:
            # copy the model so the container inside the input row is not mutated
            result.setdefault(k,type(model)(model))
            result[k] += type(model)(buildData(values,True))
        return result
    if len(content)>2 \
       and isinstance(content[0],str) and isinstance(content[1],(tuple,list)):
        return [buildData([content])] # adapts to match top level structure
    if content: # everything else produces a list of data items
        return content[:1] + buildData(content[1:],True)
    return [] # until data exhausted
output:
example = [['a', [], 'b', (), 1, None],
['a', [], 'c', (), 0, None],
['a', [], 2, None, None, None],
['a', [], 3, None, None, None],
['a', [], 3, None, None, None],
]
d = buildData(example)
print(d)
{'a': [{'b': (1, None)},
{'c': (0, None)},
2, None, None, None, 3, None, None, None, 3, None, None, None]}
restructure
This is not a problem for itertools.groupby. The logic you are using to "group" elements is unique, and I would not expect to find a built-in function that meets your exact needs. Below I begin with restructure, which takes each element from example and produces an output similar to the output you already have:
def restructure(t):
    def loop(t, r):
        if not t:
            return r[0]
        if t[-1] == ():
            return loop(t[0:-1], tuple(r))
        elif t[-1] == []:
            return loop(t[0:-1], list(r))
        elif isinstance(t[-1], str):
            return loop(t[0:-1], ({t[-1]: r},))
        else:
            return loop(t[0:-1], (t[-1], *r))
    return loop(t[0:-1], (t[-1],))

for e in example:
    print(restructure(e))
{'a': [{'b': (1, None)}]}
{'a': [{'c': (0, None)}]}
{'a': [2, None, None, None]}
{'a': [3, None, None, None]}
{'a': [3, None, None, None]}
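To make the right-to-left processing easier to follow, here is a rough iterative sketch of the same idea. The name restructure_iter is hypothetical, and raising ValueError on an empty row is an assumption of this sketch, not the behaviour of restructure above:

```python
def restructure_iter(row):
    # Walk the row from right to left, wrapping what has been collected so far.
    if not row:
        raise ValueError("row must not be empty")  # assumption of this sketch
    acc = (row[-1],)                 # start with the last value
    for item in reversed(row[:-1]):
        if item == ():
            acc = tuple(acc)         # tuple marker: collected values become a tuple
        elif item == []:
            acc = list(acc)          # list marker: collected values become a list
        elif isinstance(item, str):
            acc = ({item: acc},)     # key: wrap collected values in a one-key dict
        else:
            acc = (item, *acc)       # plain value: prepend to the collected values
    return acc[0]

print(restructure_iter(['a', [], 'b', (), 1, None]))
# {'a': [{'b': (1, None)}]}
```

Reading the loop top to bottom shows why a string always ends up as a dictionary key and why the markers () and [] decide the container type of everything to their right.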
merge
With each element restructured, we now define a way to merge restructured elements:
def merge(r, t):
    if isinstance(r, dict) and isinstance(t, dict):
        for (k,v) in t.items():
            # keys missing from r are copied over instead of raising KeyError
            r[k] = merge(r[k], v) if k in r else v
        return r
    elif isinstance(r, tuple) and isinstance(t, tuple):
        return r + t
    elif isinstance(r, list) and isinstance(t, list):
        return r + t
    else:
        return t

a = restructure(example[0])
b = restructure(example[1])
print(merge(a, b))
{'a': [{'b': (1, None)}, {'c': (0, None)}]}
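Two properties of this kind of merge are worth seeing on a small input: lists are concatenated rather than merged element by element, and the first argument is updated in place. The sketch below repeats a condensed stand-in for merge so that it runs on its own; the use of copy.deepcopy is a suggestion, not part of the answer's code:

```python
import copy

def merge(r, t):
    # condensed stand-in for the merge defined above, repeated so this runs alone
    if isinstance(r, dict) and isinstance(t, dict):
        for k, v in t.items():
            r[k] = merge(r[k], v)
        return r
    if type(r) is type(t) and isinstance(r, (list, tuple)):
        return r + t
    return t

a = {'a': [{'b': (1, None)}]}
b = {'a': [{'b': (2, None)}]}

# Lists are concatenated, so the two 'b' dicts stay separate entries.
merged = merge(copy.deepcopy(a), b)
print(merged)     # {'a': [{'b': (1, None)}, {'b': (2, None)}]}

# Without the copy, the first argument itself is updated and returned.
same = merge(a, b)
print(same is a)  # True
```

If the restructured elements need to be reused afterwards, passing a copy as the first argument keeps them unchanged.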
build
Lastly, build is responsible for tying everything together:
def build(t):
    if not t:
        return None
    elif len(t) == 1:
        return restructure(t[0])
    else:
        return merge(restructure(t[0]), build(t[1:]))

example = \
[ ['a', [], 'b', (), 1, None]
, ['a', [], 'c', (), 0, None]
, ['a', [], 2, None, None, None]
, ['a', [], 3, None, None, None]
, ['a', [], 3, None, None, None]
]
print(build(example))
{'a': [{'b': (1, None)}, {'c': (0, None)}, 2, None, None, None, 3, None, None, None, 3, None, None, None]}
Above, build is effectively the same as functools.reduce and map:
from functools import reduce

def build(t):
    if not t:
        return None
    else:
        return reduce(merge, map(restructure, t))

print(build(example))
{'a': [{'b': (1, None)}, {'c': (0, None)}, 2, None, None, None, 3, None, None, None, 3, None, None, None]}
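One detail behind this equivalence: the recursive build combines from the right, while reduce combines from the left. The two agree when the combining function is associative, which appears to hold for merge on inputs shaped like example because it concatenates. A small sketch of that point, using a hypothetical fold_right helper and plain list concatenation standing in for merge:

```python
from functools import reduce
import operator

def fold_right(combine, items):
    # same shape as the recursive build: combine(first, fold of the rest)
    if len(items) == 1:
        return items[0]
    return combine(items[0], fold_right(combine, items[1:]))

parts = [[1], [2, 3], [4]]

# Concatenation is associative, so both folds agree.
print(fold_right(operator.add, parts))  # [1, 2, 3, 4]
print(reduce(operator.add, parts))      # [1, 2, 3, 4]

# Subtraction is not associative, so the two folds differ.
nums = [10, 3, 2]
print(fold_right(operator.sub, nums))   # 10 - (3 - 2) = 9
print(reduce(operator.sub, nums))       # (10 - 3) - 2 = 5
```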
caveat
This answer does nothing to protect against invalid inputs. You are responsible for verifying inputs are valid -
restructure([]) # IndexError
restructure([[], "a"]) # a
restructure(["a", (), [], "b", ()]) # {'a': ({'b': ((),)},)}
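As a starting point for such checks, here is a minimal sketch of a row validator. The rule set is an assumption based only on the rows in example (a str key, then an empty list or tuple, then at least one value), and check_row is a hypothetical name. It would reject the first two calls above but not the third, which needs stricter rules:

```python
def check_row(row):
    # Assumed rule set (not from the answer): key, empty container, then values.
    if not isinstance(row, list) or len(row) < 3:
        raise ValueError(f"expected a list of at least 3 items, got {row!r}")
    key, model = row[0], row[1]
    if not isinstance(key, str):
        raise ValueError(f"first item must be a str key, got {key!r}")
    if model != [] and model != ():
        raise ValueError(f"second item must be [] or (), got {model!r}")
    return row

for candidate in (['a', [], 2, None], [], [[], 'a']):
    try:
        check_row(candidate)
        print("ok:", candidate)
    except ValueError as err:
        print("rejected:", err)
```

Each row could be passed through check_row before restructure is applied to it.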