
Building a linear hierarchy of a tree with Python

I have the following JSON:

{
    'a': {
        'children': [],
        'name': 'a'
    },
    'b': {
        'children': [{
                'x': {
                    'children': [],
                    'name': 'x'
                }
            }, {
                'y': {
                    'children': [{
                        'z': {
                            'children': [],
                            'name': 'z'
                        }
                    }]
                }]
        }
    }

The end result should be:

a 
b -> x
b -> y -> z

I'm having trouble wrapping my mind around the recursive function I'll need to solve this. Are linked lists the solution? My data has an unknown depth of nesting, so the function should keep descending into child nodes for as long as there are any. I have no problem listing all the nodes recursively; keeping track of the path to each one is my problem.

def print_tree(tree, prev=None, child=False):
    for node in tree:
        print(node['name'])
        if len(node['children']):          
            print_tree(node['children'])

print_tree(tree_data)

What logic am I missing here to keep track of this?

You have several problems with your code and data:

  1. The JSON is invalid: it is missing a closing } before the last ]

  2. Your b and y nodes do not have a name set

  3. Your structure is inconsistent: each item of your data is a node, each element in children is a node, but your data itself is not a node. Furthermore, your outermost data uses a { 'a': ..., 'b': ... } structure, but children use a [ { 'a': ... }, { 'b': ... } ] structure.

  4. Dict-wrapping the nodes makes it hard to get the actual nodes out. I.e., if I give you { 'x': nodeX } where 'x' is an unknown key, it's difficult for your program to extract nodeX

We start by fixing 1 and 2

data = \
  { 'a': { 'children': []
         , 'name': 'a'
         }
  , 'b': { 'children': [ { 'x': { 'children': []
                                , 'name': 'x'
                                }
                         }
                       , { 'y': { 'children': [ { 'z': { 'children': []
                                                       , 'name': 'z'
                                                       }
                                                }
                                              ]
                                , 'name': 'y' # you missed this
                                }
                         } # you missed this
                       ]
          , 'name': 'b'  # you missed this
          }
  }

Then we fix 3 by making a uniform structure with a root node

root = \
  { 'root': { 'children': [ {k:v} for (k,v) in data.items() ]
            , 'name': 'root'
            }
  }

Then we fix 4 with an unwrap_node helper

def unwrap_node (wrapped_node):
  node, *_ = wrapped_node.values()
  if 'children' in node and len (node['children']) > 0:
    return { 'name': node['name']
           , 'children': [ unwrap_node(n) for n in node['children'] ]
           }
  else:
    return node 
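As a quick check of what unwrap_node does, here is a minimal sketch on a small wrapped node. The wrapped sample is made up for illustration, and the function is repeated (with the same logic) only so the snippet runs on its own.

```python
def unwrap_node(wrapped_node):
    # Take the single value out of a one-key wrapper such as {'y': {...}}.
    node, *_ = wrapped_node.values()
    if 'children' in node and len(node['children']) > 0:
        return {'name': node['name'],
                'children': [unwrap_node(n) for n in node['children']]}
    # Leaf nodes are already in {'children': [], 'name': ...} form.
    return node

# Hypothetical sample: a wrapped 'y' node with one wrapped child 'z'.
wrapped = {'y': {'children': [{'z': {'children': [], 'name': 'z'}}],
                 'name': 'y'}}

unwrapped = unwrap_node(wrapped)
print(unwrapped)
# {'name': 'y', 'children': [{'children': [], 'name': 'z'}]}
```

The wrapper keys ('y', 'z') are gone, so every node can now be read through its 'name' and 'children' keys alone.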

Now we get to the meat of your problem. We write a generic traverse function that simply yields an ancestor path (a list of nodes, from the root down) for each leaf in your tree

def traverse (node, path = []):
  if 'children' in node and len (node['children']) > 0:
    for n in node['children']:
      yield from traverse (n, path + [ node ])
  else:
    yield path + [ node ]

Using each ancestor path, we can join the nodes by their name property, separated by " -> "

for path in traverse (unwrap_node (root)):
  print (" -> ".join (node['name'] for node in path))

# root -> a
# root -> b -> x
# root -> b -> y -> z

Lastly, to get your desired output, write print_tree along the lines of our loop above, filtering the root node out of each path so that root -> ... is not printed

def print_tree (node):    
  for path in traverse (unwrap_node (node)):
    print (" -> ".join (n['name'] for n in path if n['name'] != 'root'))

print_tree (root)
# a
# b -> x
# b -> y -> z

If you fix the severe structural problems with your JSON, you can avoid this runaround entirely.
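For illustration, here is a sketch of what a uniform structure could look like, assuming every node is a plain dict with 'name' and 'children' keys and no wrapper keys. The tree literal is a hypothetical reshaping of the data above, not something from the question, and this traverse is a slight variant that uses tuples for the path. With that shape, no unwrapping step is needed.

```python
# Hypothetical reshaped data: every node has the same {'name', 'children'} form.
tree = {
    'name': 'root',
    'children': [
        {'name': 'a', 'children': []},
        {'name': 'b', 'children': [
            {'name': 'x', 'children': []},
            {'name': 'y', 'children': [
                {'name': 'z', 'children': []},
            ]},
        ]},
    ],
}

def traverse(node, path=()):
    # Yield one root-to-leaf path (a tuple of nodes) per leaf.
    path = path + (node,)
    if node['children']:
        for child in node['children']:
            yield from traverse(child, path)
    else:
        yield path

# path[1:] skips the artificial root node.
lines = [' -> '.join(n['name'] for n in path[1:]) for path in traverse(tree)]
print('\n'.join(lines))
# a
# b -> x
# b -> y -> z
```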

If I were doing this I would collect the paths in lists and then build the strings afterward. This has the advantage of making it trivial to change what you want to do with those paths (e.g., change the output format, pass them to another function, etc.) without having to change your logic.

To do this I would make a helper function that handles building the paths, and have the function I actually call just collect and transform the results. So something like:

# this function collects the paths as lists (e.g. ['b', 'y', 'z']) and returns a list of those paths
def get_paths(tree):
  paths = []
  for branch in tree:
    name = tree[branch]['name']
    children = tree[branch]['children']
    if len(children):
      # this mostly accounts for the fact that the children are a dictionary in a list
      for node in children:
        # get the paths from the children
        sub_paths = get_paths(node)
        # add this element to the beginning of those paths
        for path in sub_paths:
          path.insert(0, name)
        # transfer modified sub-paths to list of paths
        paths.extend(sub_paths)
    else:
      # leaf node, add as a path with one element
      paths.append([name])
  return paths

# this function uses the above function to get the paths and then prints the results as desired
def print_tree(tree):
  paths = get_paths(tree)
  # do whatever you want with the paths
  for path in paths:
    print(' -> '.join(path))

Which for your input (modified to add a name for 'y' and 'b'), gives:

a
b -> x
b -> y -> z
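To illustrate the point about reusing the paths, here is a small sketch that formats them in two other ways without touching the path-building logic. The get_paths here is a compact equivalent of the one above, repeated so the snippet runs on its own; tree_data is the question's data with the missing names filled in, and slash_paths and depths are names invented for this example.

```python
def get_paths(tree):
    # Collect every root-to-leaf path as a list of names, e.g. ['b', 'y', 'z'].
    paths = []
    for branch in tree:
        name = tree[branch]['name']
        children = tree[branch]['children']
        if children:
            for node in children:
                # Prefix this node's name to each path found below it.
                paths.extend([name] + sub for sub in get_paths(node))
        else:
            paths.append([name])
    return paths

# The question's data, with the missing names for 'b' and 'y' added.
tree_data = {
    'a': {'children': [], 'name': 'a'},
    'b': {'children': [
              {'x': {'children': [], 'name': 'x'}},
              {'y': {'children': [{'z': {'children': [], 'name': 'z'}}],
                     'name': 'y'}},
          ],
          'name': 'b'},
}

paths = get_paths(tree_data)

# Two different uses of the same paths, with no change to get_paths.
slash_paths = ['/'.join(p) for p in paths]
depths = {p[-1]: len(p) for p in paths}

print(slash_paths)  # ['a', 'b/x', 'b/y/z']
print(depths)       # {'a': 1, 'x': 2, 'z': 3}
```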

