
How to cut a very "deep" JSON or dictionary in Python?

I have a JSON object that is very deep. In other words, I have a dictionary containing dictionaries, which in turn contain dictionaries, and so on many times. One can picture it as a huge tree in which some nodes are very far from the root node.

Now I would like to cut this tree so that it contains only nodes that are no more than N steps from the root. Is there a simple way to do it?

For example, if I have:

{'a':{'d':{'e':'f', 'l':'m'}}, 'b':'c', 'w':{'x':{'z':'y'}}}

and I want to keep only nodes that are at most 2 steps from the root, I should get:

{'a':{'d':'o1'}, 'b':'c', 'w':{'x':'o2'}}

So, I just replace the dictionaries that sit too far from the root with single values.

Given that your data is very deep, you may well run into stack limits with recursion. Here's an iterative approach that you might be able to clean up and polish a bit:

import collections

def cut(dict_, maxdepth, replaced_with=None):
    """Cuts the dictionary at the specified depth.

    If maxdepth is n, then only n levels of keys are kept.
    """
    queue = collections.deque([(dict_, 0)])

    # invariant: every entry in the queue is a dictionary
    while queue:
        parent, depth = queue.popleft()
        for key, child in parent.items():
            if isinstance(child, dict):
                if depth == maxdepth - 1:
                    parent[key] = replaced_with
                else:
                    queue.append((child, depth+1))

def prune(tree, max, current=0):
    for key, value in tree.items():
        if isinstance(value, dict):
            if current == max:
                tree[key] = None
            else:
                prune(value, max, current + 1)

This is mostly an example to get you started. It prunes the dictionary in place. For example:

>>> dic = {'a':{'d':{'e':'f', 'l':'m'}}, 'b':'c', 'w':{'x':{'z':'y'}}}
>>> prune(dic, 1)
>>> dic
{'b': 'c', 'w': {'x': None}, 'a': {'d': None}}
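
If mutating the input is not acceptable, the same recursive idea can build a new dictionary instead. The following is only a sketch: `pruned_copy` and its `placeholder` parameter are names made up for this illustration, and the depth convention is assumed to match `prune` above (a depth of 1 keeps two levels of keys). Note that it still recurses, but only as deep as `max_depth`, not as deep as the data.

```python
def pruned_copy(tree, max_depth, placeholder=None, current=0):
    """Return a pruned copy of `tree`; the input dictionary is left untouched."""
    result = {}
    for key, value in tree.items():
        if isinstance(value, dict):
            if current == max_depth:
                # Too deep: stand in for the whole sub-dictionary.
                result[key] = placeholder
            else:
                # Recurse one level down; recursion depth is bounded by max_depth.
                result[key] = pruned_copy(value, max_depth, placeholder, current + 1)
        else:
            # Leaf values are copied over as they are.
            result[key] = value
    return result


dic = {'a': {'d': {'e': 'f', 'l': 'm'}}, 'b': 'c', 'w': {'x': {'z': 'y'}}}
shallow = pruned_copy(dic, 1)
```

Here `shallow` holds the pruned structure, while `dic` still contains the full nested data.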

You could do something like:

initial_dict = {'a':{'d':{'e':'f', 'l':'m'}}, 'b':'c', 'w':{'x':{'z':'y'}}}
current_index = 0
for item in initial_dict.items():
    if isinstance(item[1], dict):
        current_index += 1
        initial_dict[item[0]] = {key:'o'+str(current_index) for key in item[1].keys()}

I believe one problem with this code is that a second-level dict with multiple keys (example follows) gets the same value for every key, but you can adapt the code to work around it.

For example:

# suppose you have this dict initially
initial_dict = {'a':{'d':{'e':'f', 'l':'m'}}, 'b':'c', 'w':{'x':{'z':'y'}, 'b':{'p':'r'}}}
# you would get
initial_dict = {'a':{'d':'o1'}, 'b':'c', 'w':{'x':'o2', 'b':'o2'}}
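
One possible way to adapt it, so that every replaced sub-dictionary gets its own label and the cut works at any depth, is sketched below. This is an assumption about what "adapting" could look like, not the original answer's code: `cut_with_labels` and `prefix` are hypothetical names, `max_depth` is assumed to be at least 1 and counts steps from the root as in the question, and the dictionary is modified in place. It walks the tree with a queue rather than recursion, so very deep data should not hit the recursion limit.

```python
from collections import deque


def cut_with_labels(tree, max_depth, prefix="o"):
    """Replace each dict found at `max_depth` steps from the root with a unique label."""
    counter = 0
    queue = deque([(tree, 1)])  # keys of the root are 1 step from the root
    while queue:
        node, depth = queue.popleft()
        for key, value in node.items():
            if not isinstance(value, dict):
                continue  # plain values are kept as they are
            if depth >= max_depth:
                # Overwriting an existing key does not resize the dict,
                # so it is safe to do while iterating over it.
                counter += 1
                node[key] = f"{prefix}{counter}"
            else:
                queue.append((value, depth + 1))
    return tree


initial_dict = {'a': {'d': {'e': 'f', 'l': 'm'}}, 'b': 'c',
                'w': {'x': {'z': 'y'}, 'b': {'p': 'r'}}}
cut_with_labels(initial_dict, 2)
```

With this version the two sub-dictionaries under 'w' receive different labels instead of sharing one.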
