Use Python to recurse a flat file and build a hierarchy

I have a flat text file that represents a hierarchy. It looks similar to this:

0 tom (1)
1   janet (8)
2     harry (1)
3       jules (1)
3       jacob (1)
1   mary (13)
2     jeff (1)
3       sam (2)
1   bob (28)
2     dick (1)

I want to read this in and build a nested dictionary (or some other data structure) that represents the hierarchy so it is easier to manage, but I can't work out how to iterate over the lines and build the structure. Maybe recursion?

The first number is the level in the hierarchy, the word is the name I want to store, and the value in parentheses is the quantity, which I also want to store.
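For illustration, a single line in this format could be split into its three fields roughly as follows. This is only a sketch: the regular expression and the names LINE_RE and parse_line are assumptions made for the example, and it assumes names consist of word characters only.

```python
import re

# Assumed line shape: <level> <name> (<quantity>), with arbitrary spacing.
# LINE_RE and parse_line are hypothetical names used only for this sketch.
LINE_RE = re.compile(r'^\s*(\d+)\s+(\w+)\s+\((\d+)\)\s*$')

def parse_line(line):
    """Return (level, name, quantity) for a line such as '1   janet (8)'."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f'unrecognized line: {line!r}')
    level, name, quantity = m.groups()
    return int(level), name, int(quantity)

print(parse_line('1   janet (8)'))  # (1, 'janet', 8)
```

Raising on a line that does not match makes malformed input visible early instead of producing a silently wrong tree.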

I'd like to end up with something similar to this:

{
  "tom": {
    "quantity": 1,
    "names": {
      "janet": {
        "quantity": 8,
        "names": {
          "harry": {
            "quantity": 1,
            "names": {
              "jules": {
                "quantity": 1
              },
              "jacob": {
                "quantity": 1
              }
            }
          }
        }
      },
      "mary": {
        "quantity": 13,
        "names": {
          "jeff": {
            "quantity": 1,
            "names": {
              "sam": {
                "quantity": 2
              }
            }
          }
        }
      },
      "bob": {
        "quantity": 28,
        "names": {
          "dick": {
            "quantity": 1
          }
        }
      }
    }
  }
}

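You can use recursion: group each level-0 row with the deeper rows that follow it, then recurse on those rows with their levels reduced by one. The walrus operator (:=) used below requires Python 3.8 or later: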

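import json
import re

# Parse each non-blank line into [level, name, quantity], e.g. "1   janet (8)" -> [1, 'janet', 8].
with open('test_hierarchy.txt') as f:
    d = [[int((k := re.findall(r'\d+|\w+', i))[0]), k[1], int(k[-1])] for i in f if i.strip()]

def to_tree(data):
    if not data:
        return {}
    r, _key, _val = {}, None, []
    for a, b, c in data:
        if not a:
            # A level-0 row starts a new sibling; store the previous one with its collected children.
            if _key is not None:
                r[_key[0]] = {'quantity': _key[-1], 'names': to_tree(_val)}
            _key, _val = (b, c), []
        else:
            # Deeper rows belong to the current sibling; shift their level up by one.
            _val.append([a - 1, b, c])
    # Store the last sibling, which the loop has not flushed yet.
    r[_key[0]] = {'quantity': _key[-1], 'names': to_tree(_val)}
    # Drop the empty 'names' dict from leaf entries.
    return {a: {'quantity': b['quantity']} if not b['names'] else b for a, b in r.items()}

print(json.dumps(to_tree(d), indent=4))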

Output:

{
    "tom": {
        "quantity": 1,
        "names": {
            "janet": {
                "quantity": 8,
                "names": {
                    "harry": {
                        "quantity": 1,
                        "names": {
                            "jules": {
                                "quantity": 1
                            },
                            "jacob": {
                                "quantity": 1
                            }
                        }
                    }
                }
            },
            "mary": {
                "quantity": 13,
                "names": {
                    "jeff": {
                        "quantity": 1,
                        "names": {
                            "sam": {
                                "quantity": 2
                            }
                        }
                    }
                }
            },
            "bob": {
                "quantity": 28,
                "names": {
                    "dick": {
                        "quantity": 1
                    }
                }
            }
        }
    }
}
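Recursion is not strictly required here. As an alternative sketch (not the method used in the answer above), the same structure can be built in a single pass by keeping a stack of the most recent node at each level. The function name build_tree, the ROWS sample, and the choice to raise ValueError when a level is skipped are assumptions made for this example; the rows are assumed to be already parsed into (level, name, quantity) tuples.

```python
import json

def build_tree(rows):
    """Build the nested dict from (level, name, quantity) tuples in one pass."""
    root = {}            # pseudo-root; its 'names' dict holds the level-0 entries
    stack = [root]       # stack[i] is the parent node for a row at level i
    for level, name, quantity in rows:
        if level >= len(stack):
            raise ValueError(f'level {level} for {name!r} skips a parent level')
        del stack[level + 1:]          # discard nodes deeper than this row's parent
        node = {'quantity': quantity}
        # 'names' is created lazily, so leaf nodes never receive an empty dict.
        stack[level].setdefault('names', {})[name] = node
        stack.append(node)             # this node is the parent for level + 1
    return root.get('names', {})

# Sample rows taken from the flat file in the question.
ROWS = [
    (0, 'tom', 1), (1, 'janet', 8), (2, 'harry', 1), (3, 'jules', 1),
    (3, 'jacob', 1), (1, 'mary', 13), (2, 'jeff', 1), (3, 'sam', 2),
    (1, 'bob', 28), (2, 'dick', 1),
]

print(json.dumps(build_tree(ROWS), indent=4))
```

Each row is visited once, so the work grows linearly with the number of lines, and deeply nested input cannot hit Python's recursion limit.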
