python itertools groupby return tuple

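I need to parse a flattened structure and build a nested structure from it, using a provided list of keys. I have solved the problem, but I am looking for improvements and would like to learn what could be changed in my code. Could someone review it and refactor it in a better way?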

src_data = [
  {
    "key1": "XX",
    "key2": "X111",
    "key3": "1aa",
    "key4": 1
  },
  {
    "key1": "YY",
    "key2": "Y111",
    "key3": "1bb",
    "key4": 11
  },
  {
    "key1": "ZZ",
    "key2": "Z111",
    "key3": "1cc",
    "key4": 2.4
  },
  {
    "key1": "AA",
    "key2": "A111",
    "key3": "1cc",
    "key4": 33333.2122
  },
  {
    "key1": "BB",
    "key2": "B111",
    "key3": "1bb",
    "key4": 2
  },
]

This is the code I have developed so far to create the final result.

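# Imports needed by the snippets in this question
import operator
from functools import reduce
from itertools import groupby
from operator import getitem, itemgetter


def plant_tree(ll):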
    master_tree = {}

    for i in ll:
        tree = master_tree
        for n in i:
            if n not in tree:
                tree[n] = {}
            tree = tree[n]
    return master_tree



def make_nested_object(tt, var):
    elo = lambda l: reduce(lambda x, y: {y: x}, l[::-1], var)
    return {'n_path': tt, 'n_structure': elo(tt)}



def getFromDict(dataDict, mapList):
    return reduce(operator.getitem, mapList, dataDict)


def set_nested_item(dataDict, mapList, val):
    """Set item in nested dictionary"""
    reduce(getitem, mapList[:-1], dataDict)[mapList[-1]] = val
    return dataDict



def update_tree(data_tree):
    # MAKE NESTED OBJECT
    out = (make_nested_object(k, v) for k,v, in res_out.items())


    for dd in out:
        leaf_data = dd['n_structure']
        leaf_path = dd['n_path']
        data_tree = set_nested_item(data_tree, leaf_path, getFromDict(leaf_data, leaf_path))
    return data_tree

This is the custom itemgetter function from this question:

def customed_itemgetter(*args):
    # this handles the case when one key is provided
    f = itemgetter(*args)
    if len(args) > 1:
        return f
    return lambda obj: (f(obj),)

Define the nesting levels:

nesting_keys = ['key1', 'key3', 'key2']

grouper = customed_itemgetter(*nesting_keys)
ii = groupby(sorted(src_data, key=grouper), grouper)

res_out = {key: [{k:v for k,v in i.items() if k not in nesting_keys} for i in group] for key,group in ii}
#
ll = ([dd[x] for x in nesting_keys] for dd in src_data)
data_tree = plant_tree(ll)

Get the result:

result = update_tree(data_tree)

How can I improve my code?

If itemgetter [Python-doc] is given a single key, it returns that single item directly and does not wrap it in a one-element tuple.
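
To make the difference concrete, here is a minimal sketch of how operator.itemgetter behaves with one key versus several. The record is just the first two fields of the first item in the question's src_data; the variable names are only illustrative.

```python
from operator import itemgetter

record = {"key1": "XX", "key2": "X111"}

# One key: the bare value comes back, not a tuple.
single = itemgetter("key1")(record)

# Several keys: a tuple of the values comes back.
several = itemgetter("key1", "key2")(record)

print(repr(single))
print(repr(several))
```

This is why code that always expects a tuple key (for example, to use it as a path into a nested dict) needs special handling for the one-key case.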

We can, however, build a function that does this, for example:

from itertools import groupby
from operator import itemgetter

def itemgetter2(*args):
    f = itemgetter(*args)
    if len(args) > 1:
        return f
    return lambda obj: (f(obj),)

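We can then use the new itemgetter2, for example (ll here is presumably the list of grouping keys, that is, nesting_keys in the question above):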

grouper = itemgetter2(*ll)
ii = groupby(sorted(src_data, key=grouper), grouper)
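
As a rough, self-contained illustration of this pattern, the sketch below groups a few records by a single key and ends up with one-element tuples as group keys. The helper name tuple_getter and the rows data are hypothetical and are not taken from the question.

```python
from itertools import groupby
from operator import itemgetter


def tuple_getter(*keys):
    """Like itemgetter, but always returns a tuple (hypothetical helper)."""
    f = itemgetter(*keys)
    if len(keys) > 1:
        return f  # itemgetter already returns a tuple for several keys
    return lambda obj: (f(obj),)


# Hypothetical sample records, not the data from the question.
rows = [
    {"kind": "fruit", "name": "apple"},
    {"kind": "veg", "name": "leek"},
    {"kind": "fruit", "name": "pear"},
]

grouper = tuple_getter("kind")
# groupby only merges adjacent items, so sort by the same key first.
grouped = {
    key: [row["name"] for row in group]
    for key, group in groupby(sorted(rows, key=grouper), grouper)
}
print(grouped)
```

Because the key function always returns a tuple, downstream code can treat every group key as a path of values, whether one grouping key or several were supplied.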

EDIT: Based on your question, however, you want to perform multi-level grouping. We can write a function for that, for example:

def multigroup(groups, iterable, index=0):
    if len(groups) <= index:
        return list(iterable)
    else:
        f = itemgetter(groups[index])
        i1 = index + 1
        return {
            k: multigroup(groups, vs, index=i1)
            for k, vs in groupby(sorted(iterable, key=f), f)
        }

For the src_data in the question (the original sample data, whose items use keys such as 'a' and 'b'), this generates:

>>> multigroup(['a', 'b'], src_data)
{1: {2: [{'a': 1, 'b': 2, 'z': 3}]}, 2: {3: [{'a': 2, 'b': 3, 'e': 2}]}, 4: {3: [{'a': 4, 'x': 3, 'b': 3}]}}

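You can, however, post-process the values where list(..) is called. For example, we can produce dictionaries that no longer contain the grouping columns: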

def multigroup(groups, iterable):
    group_set = set(groups)
    fs = [itemgetter(group) for group in groups]
    def mg(iterable, index=0):
        if len(groups) <= index:
            return [
                {k: v for k, v in item.items() if k not in group_set}
                for item in iterable
            ]
        else:
            i1 = index + 1
            return {
                k: mg(vs, index=i1)
                for k, vs in groupby(sorted(iterable, key=fs[index]), fs[index])
            }
    return mg(iterable)

For the given sample input, we get:

>>> multigroup(['a', 'b'], src_data)
{1: {2: [{'z': 3}]}, 2: {3: [{'e': 2}]}, 4: {3: [{'x': 3}]}}

Or, for the new sample data:

>>> pprint(multigroup(['key1', 'key3', 'key2'], src_data))
{'AA': {'1cc': {'A111': [{'key4': 33333.2122}]}},
 'BB': {'1bb': {'B111': [{'key4': 2}]}},
 'XX': {'1aa': {'X111': [{'key4': 1}]}},
 'YY': {'1bb': {'Y111': [{'key4': 11}]}},
 'ZZ': {'1cc': {'Z111': [{'key4': 2.4}]}}}
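
As a possible alternative (a different technique from the groupby-based answer above, not part of it), the same nested shape can be built in a single pass with dict.setdefault, which avoids sorting at every level. This is only a sketch: the function name nest is hypothetical, it assumes at least one grouping key, and it assumes every record contains all of the grouping keys. The data below is the first two records from the question.

```python
def nest(rows, keys):
    """Group rows into nested dicts keyed by the values of `keys`, in order.

    Assumes `keys` is non-empty and every row contains all of the keys.
    """
    key_set = set(keys)
    tree = {}
    for row in rows:
        node = tree
        # Walk down (creating as needed) one dict level per key but the last.
        for key in keys[:-1]:
            node = node.setdefault(row[key], {})
        # The last key maps to a list holding the remaining fields.
        leaf = node.setdefault(row[keys[-1]], [])
        leaf.append({k: v for k, v in row.items() if k not in key_set})
    return tree


src_data = [
    {"key1": "XX", "key2": "X111", "key3": "1aa", "key4": 1},
    {"key1": "YY", "key2": "Y111", "key3": "1bb", "key4": 11},
]

result = nest(src_data, ["key1", "key3", "key2"])
print(result)
```

Unlike the groupby version, the outer keys here keep the order in which they first appear in the input rather than sorted order, which may or may not matter for your use case.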

