Creating a nested dictionary from a flattened dictionary
I have a flattened dictionary that I want to turn into a nested one. It has the form
flat = {'X_a_one': 10,
        'X_a_two': 20,
        'X_b_one': 10,
        'X_b_two': 20,
        'Y_a_one': 10,
        'Y_a_two': 20,
        'Y_b_one': 10,
        'Y_b_two': 20}
and I want to convert it to the form
nested = {'X': {'a': {'one': 10,
                      'two': 20},
                'b': {'one': 10,
                      'two': 20}},
          'Y': {'a': {'one': 10,
                      'two': 20},
                'b': {'one': 10,
                      'two': 20}}}
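The flat dictionary is structured so that there should be no ambiguity problems. I want this to work for dictionaries of arbitrary depth, but performance is not really an issue. I have seen plenty of methods for flattening a nested dictionary, but essentially none for nesting a flattened one. The values stored in the dictionary are either scalars or strings, never iterables.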
So far I have something that can take the input
test_dict = {'X_a_one': '10',
             'X_b_one': '10',
             'X_c_one': '10'}
and turn it into the output
test_out = {'X': {'a_one': '10',
                  'b_one': '10',
                  'c_one': '10'}}
using the code
def nest_once(inp_dict):
    out = {}
    if isinstance(inp_dict, dict):
        for key, val in inp_dict.items():
            if '_' in key:
                head, tail = key.split('_', 1)
                if head not in out.keys():
                    out[head] = {tail: val}
                else:
                    out[head].update({tail: val})
            else:
                out[key] = val
    return out

test_out = nest_once(test_dict)
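But I can't work out how to turn this into something that recursively creates all levels of the dictionary.
Any help would be appreciated!
(As for why I want to do this: I have a file whose structure is equivalent to a nested dictionary, and I want to store its contents in the attributes dictionary of a NetCDF file and retrieve them later. However, NetCDF only lets you store flat dictionaries as attributes, so I want to unflatten the dictionary I previously stored in the NetCDF file.)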
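For the round trip described above, the flattening half might look like the following. This is a minimal sketch rather than code from the question: the name `flatten` and its `sep` and `prefix` parameters are assumptions, and it relies on the question's guarantee that no individual key contains the separator.

```python
def flatten(nested, sep='_', prefix=''):
    """Collapse a nested dict into a single-level dict with joined keys.

    Hypothetical helper: assumes no individual key contains `sep`.
    """
    flat = {}
    for key, val in nested.items():
        full_key = prefix + sep + key if prefix else key
        if isinstance(val, dict):
            # Recurse into sub-dictionaries, carrying the joined key along.
            flat.update(flatten(val, sep, full_key))
        else:
            flat[full_key] = val
    return flat


example = {'X': {'a': {'one': 10, 'two': 20}}, 'Y': {'b': {'one': 10}}}
flattened = flatten(example)
print(flattened)
# {'X_a_one': 10, 'X_a_two': 20, 'Y_b_one': 10}
```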
Here is my take:
def nest_dict(flat):
    result = {}
    for k, v in flat.items():
        _nest_dict_rec(k, v, result)
    return result

def _nest_dict_rec(k, v, out):
    k, *rest = k.split('_', 1)
    if rest:
        _nest_dict_rec(rest[0], v, out.setdefault(k, {}))
    else:
        out[k] = v

flat = {'X_a_one': 10,
        'X_a_two': 20,
        'X_b_one': 10,
        'X_b_two': 20,
        'Y_a_one': 10,
        'Y_a_two': 20,
        'Y_b_one': 10,
        'Y_b_two': 20}
nested = {'X': {'a': {'one': 10,
                      'two': 20},
                'b': {'one': 10,
                      'two': 20}},
          'Y': {'a': {'one': 10,
                      'two': 20},
                'b': {'one': 10,
                      'two': 20}}}
print(nest_dict(flat) == nested)
# True
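The recursion above hinges on `k.split('_', 1)` combined with starred unpacking: the key is split at the first separator only, and `rest` is empty once no separator remains. A small illustration (the helper name `split_head` is made up for this sketch):

```python
def split_head(key, sep='_'):
    """Return (head, rest), where rest is [] when no separator remains."""
    head, *rest = key.split(sep, 1)  # maxsplit=1: split at the first separator only
    return head, rest


print(split_head('X_a_one'))  # ('X', ['a_one'])
print(split_head('one'))      # ('one', [])
```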
# `source` is the flat input dictionary (called `flat` in the question)
output = {}
for k, v in source.items():
    # always start at the root.
    current = output
    # This is the part you're struggling with.
    pieces = k.split('_')
    # iterate from the beginning until the second to last place
    for piece in pieces[:-1]:
        if piece not in current:
            # if a dict doesn't exist at an index, then create one
            current[piece] = {}
        # as you walk into the structure, update your current location
        current = current[piece]
    # The reason you're using the second to last is because the last place
    # represents the place you're actually storing the item
    current[pieces[-1]] = v
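The loop above can be wrapped in a function. The sketch below does that and, as an addition of my own, raises an error when a key is used both as a leaf and as a branch (for example `'X'` together with `'X_a'`). The question says this never happens, so the check is purely defensive. The name `nest_iterative` and the `sep` parameter are assumptions, and the check relies on the stated guarantee that stored values are never dictionaries.

```python
def nest_iterative(source, sep='_'):
    """Iteratively nest a flat dict; raise if a key is both a leaf and a branch."""
    output = {}
    for key, value in source.items():
        current = output
        *parents, leaf = key.split(sep)
        for piece in parents:
            # Create the intermediate dict on first visit, then step into it.
            current = current.setdefault(piece, {})
            if not isinstance(current, dict):
                raise ValueError(f"{key!r} conflicts with an existing leaf at {piece!r}")
        if isinstance(current.get(leaf), dict):
            raise ValueError(f"{key!r} conflicts with an existing branch")
        current[leaf] = value
    return output


print(nest_iterative({'X_a_one': 10, 'X_a_two': 20, 'Y_b_one': 10}))
# {'X': {'a': {'one': 10, 'two': 20}}, 'Y': {'b': {'one': 10}}}
```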
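Here is one approach using collections.defaultdict, borrowing heavily from a previous answer. There are 3 steps:
1. Create a nested defaultdict of defaultdict objects.
2. Iterate over the items in the flat input dictionary.
3. Build the defaultdict result according to the structure obtained by splitting keys on _, using getFromDict to walk the result dictionary.
Here is a complete example: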
from collections import defaultdict
from functools import reduce
from operator import getitem

def getFromDict(dataDict, mapList):
    """Iterate nested dictionary"""
    return reduce(getitem, mapList, dataDict)

# instantiate nested defaultdict of defaultdicts
tree = lambda: defaultdict(tree)
d = tree()

# iterate input dictionary
for k, v in flat.items():
    *keys, final_key = k.split('_')
    getFromDict(d, keys)[final_key] = v

# d now holds (shown as plain dicts):
# {'X': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}},
#  'Y': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}}}
As a final step, you can convert the defaultdict to a regular dict, though this is usually not necessary.
def default_to_regular_dict(d):
    """Convert nested defaultdict to regular dict of dicts."""
    if isinstance(d, defaultdict):
        d = {k: default_to_regular_dict(v) for k, v in d.items()}
    return d

# convert back to regular dict
res = default_to_regular_dict(d)
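To see why the self-referencing `tree` factory works, it may help to watch the intermediate levels appear on demand. The sketch below writes `tree` as a named function instead of a lambda, which behaves the same way. It also shows one reason a final conversion to a regular dict can be worthwhile: simply reading a missing key inserts it.

```python
from collections import defaultdict


def tree():
    """Return a defaultdict whose missing keys are themselves trees."""
    return defaultdict(tree)


d = tree()
# None of the intermediate levels exist yet; each lookup creates one.
d['X']['a']['one'] = 10
print(d['X']['a']['one'])  # 10

# Caveat: merely reading a missing key also inserts it.
_ = d['Z']
print('Z' in d)  # True
```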
The other answers are cleaner, but since you mentioned recursion, here is another option.
def nest(d):
    _ = {}
    for k in d:
        i = k.find('_')
        if i == -1:
            _[k] = d[k]
            continue
        s, t = k[:i], k[i+1:]
        if s in _:
            _[s][t] = d[k]
        else:
            _[s] = {t:d[k]}
    return {k:(nest(_[k]) if type(_[k])==type(d) else _[k]) for k in _}
You can use itertools.groupby:
import itertools, json
flat = {'Y_a_two': 20, 'Y_a_one': 10, 'X_b_two': 20, 'X_b_one': 10, 'X_a_one': 10, 'X_a_two': 20, 'Y_b_two': 20, 'Y_b_one': 10}
_flat = [[*a.split('_'), b] for a, b in flat.items()]
def create_dict(d):
    _d = {a:list(b) for a, b in itertools.groupby(sorted(d, key=lambda x:x[0]), key=lambda x:x[0])}
    return {a:create_dict([i[1:] for i in b]) if len(b) > 1 else b[0][-1] for a, b in _d.items()}

print(json.dumps(create_dict(_flat), indent=3))
Output:
{
   "Y": {
      "b": {
         "two": 20,
         "one": 10
      },
      "a": {
         "two": 20,
         "one": 10
      }
   },
   "X": {
      "b": {
         "two": 20,
         "one": 10
      },
      "a": {
         "two": 20,
         "one": 10
      }
   }
}
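The `sorted(...)` call in `create_dict` matters because `itertools.groupby` only groups consecutive items that share a key. A short illustration with made-up rows:

```python
import itertools

rows = [['Y', 1], ['X', 2], ['Y', 3]]

# Without sorting, 'Y' appears as two separate groups.
unsorted_keys = [key for key, _ in itertools.groupby(rows, key=lambda r: r[0])]
print(unsorted_keys)  # ['Y', 'X', 'Y']

# Sorting by the same key first yields one group per distinct key.
rows_sorted = sorted(rows, key=lambda r: r[0])
sorted_groups = {key: [r[1] for r in grp]
                 for key, grp in itertools.groupby(rows_sorted, key=lambda r: r[0])}
print(sorted_groups)  # {'X': [2], 'Y': [1, 3]}
```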
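Another non-recursive solution with no imports. It separates the logic for inserting a single key-value pair of the flat dict from the logic for mapping over all of its key-value pairs.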
def insert(dct, lst):
    """
    dct: a dict to be modified inplace.
    lst: list of elements representing a hierarchy of keys
    followed by a value.

    dct = {}
    lst = [1, 2, 3]
    resulting value of dct: {1: {2: 3}}
    """
    for x in lst[:-2]:
        dct[x] = dct = dct.get(x, dict())
    dct.update({lst[-2]: lst[-1]})

def unflat(dct):
    # empty dict to store the result
    result = dict()
    # create an iterator of lists representing hierarchical indices followed by the value
    lsts = ([*k.split("_"), v] for k, v in dct.items())
    # insert each list into the result
    for lst in lsts:
        insert(result, lst)
    return result

result = unflat(flat)
# {'X': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}},
#  'Y': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}}}
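The line `dct[x] = dct = dct.get(x, dict())` relies on Python evaluating the right-hand side once and then assigning to the targets from left to right, so the sub-dict is stored in the parent before `dct` is rebound to it. If that reads as too subtle, the same walk can be sketched with `dict.setdefault`; the name `insert_path` and its signature are hypothetical, not part of the answer above.

```python
def insert_path(dct, keys, value):
    """Store `value` in `dct` under the nested path given by `keys`."""
    *parents, leaf = keys
    for key in parents:
        # setdefault returns the existing sub-dict, or stores and returns a new one.
        dct = dct.setdefault(key, {})
    dct[leaf] = value


result = {}
insert_path(result, ['X', 'a', 'one'], 10)
insert_path(result, ['X', 'a', 'two'], 20)
print(result)  # {'X': {'a': {'one': 10, 'two': 20}}}
```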
Here is a reasonably readable recursive solution:
def unflatten_dict(a, result = None, sep = '_'):
    if result is None:
        result = dict()
    for k, v in a.items():
        k, *rest = k.split(sep, 1)
        if rest:
            unflatten_dict({rest[0]: v}, result.setdefault(k, {}), sep = sep)
        else:
            result[k] = v
    return result

flat = {'X_a_one': 10,
        'X_a_two': 20,
        'X_b_one': 10,
        'X_b_two': 20,
        'Y_a_one': 10,
        'Y_a_two': 20,
        'Y_b_one': 10,
        'Y_b_two': 20}
print(unflatten_dict(flat))
# {'X': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}},
#  'Y': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}}}
This is based on several of the answers above, uses no imports, and has only been tested in Python 3.
To install ndicts:
pip install ndicts
Then in your script:
from ndicts.ndicts import NestedDict

flat = {'X_a_one': 10,
        'X_a_two': 20,
        'X_b_one': 10,
        'X_b_two': 20,
        'Y_a_one': 10,
        'Y_a_two': 20,
        'Y_b_one': 10,
        'Y_b_two': 20}

nd = NestedDict()
for key, value in flat.items():
    n_key = tuple(key.split("_"))
    nd[n_key] = value
If you need the result as a dictionary:
>>> nd.to_dict()
{'X': {'a': {'one': 10, 'two': 20},
       'b': {'one': 10, 'two': 20}},
 'Y': {'a': {'one': 10, 'two': 20},
       'b': {'one': 10, 'two': 20}}}