Create a tree from multiple nested dictionaries/lists in Python

Question

Preface: To explain why I am doing this, I'll start with the end goal. I have a list of accounts that are defined in a very specific syntax. Here are some examples:

Assets:Bank:Car
Assets:Bank:House
Assets:Savings:Emergency
Assets:Savings:Goals:Roof
Assets:Reserved

As can be seen above, an account can have any number of parents and children. The end goal is to parse the above accounts into a tree structure in Python that will be used to provide account auto-completion in the Sublime Text editor (i.e., if I typed Assets: and then queried for auto-complete, I would be presented with a list such as: Bank, Savings, Reserved).

The Result: Using the account list from the preface, the desired result in Python would look something like below:

[  
   {  
      "Assets":[  
         {  
            "Bank":[  
               "Car",
               "House"
            ]
         },
         {  
            "Savings":[  
               "Emergency",
               {  
                  "Goals":[  
                     "Roof"
                  ]
               }
            ]
         },
         "Reserved"
      ]
   }
]

Half-Solution: Using recursion, I was able to merge two basic accounts. This works for these two: Assets:Bank:Car and Assets:Bank:House. However, once the accounts start to differ, it falls apart and the recursion gets messy, so I'm not sure it's the best way.

import re

def parse_account(account_str):
    subs = account_str.split(":")

    def separate(subs):
        if len(subs) == 1:
            return subs
        elif len(subs):
            return [{subs[0]: separate(subs[1:])}]

    return separate(subs)

def merge_dicts(a, b):
    # a will be a list with dictionaries and text values and then nested lists/dictionaries/text values
    # b will always be a list with ONE dictionary or text value

    key = list(b[0].keys())[0] # the only key of the single dictionary in the b list (list() keeps this working on Python 3)

    for item in a: # item is a dictionary or a text value
        if isinstance(item, dict): # if item is a dictionary
            if key in item:
                # Is the value a list with a dict or a list with a text value
                if isinstance(b[0][key][0], str):
                    # Extend the current list with the new value
                    item[key].extend(b[0][key])
                else:
                    # Recurse to the next child
                    merge_dicts(item[key], b[0][key])
            else:
                # Not implemented yet: this is the case where the accounts start to differ
                pass

    return a

# Accounts have an "open [name]" syntax for defining them
text = "open Assets:Bank:Car\nopen Assets:Bank:House\nopen Assets:Savings:Emergency\nopen Assets:Savings:Goals:Roof\nopen Assets:Reserved"
EXP = re.compile("open (.*)")
accounts = EXP.findall(text) # This grabs all accounts

# Create a list of all the parsed accounts
dicts = []
for account in accounts:
    dicts.append(parse_account(account))

# Attempt to merge two accounts together
final = merge_dicts(dicts[0], dicts[1])
print(final)

# In the future we would call: reduce(merge_dicts, dicts) to merge all accounts

I could be going about this in the completely wrong way and I would be interested in differing opinions. Otherwise, does anyone have insight into how to make this work with the remaining accounts in the example string?

Answer 1

That took me ages to sort out in my head. The dictionaries are simple: each has one key whose value is always a list, so a dictionary is effectively a named list.

Each list contains strings (leaf nodes) or further dictionaries (each again a single key with a list as its value).

That means we can break up 'Assets:Bank:Car' and look for a dictionary in the root list matching {"Assets":[<whatever>]}, or add one, and then jump to the [<whatever>] list two levels deeper. On the next loop iteration, look for a dictionary matching {"Bank":[<whatever>]}, or add one, and jump to its [<whatever>] list two levels deeper. Keep doing that until we hit the last node, Car. We must be on a list at that point, since we always jumped to an existing list or made a new one, so put Car in the current list.

NB: this approach would not cope with input such as

Assets:Reserved
Assets:Reserved:Painting

but that would be nonsensical, conflicting input, asking "Reserved" to be both a leaf node and a container. In that situation you would only have:

Assets:Reserved:Painting

right?

data = """
Assets:Bank:Car
Assets:Bank:House
Assets:Savings:Emergency
Assets:Savings:Goals:Roof
Assets:Reserved
"""
J = []

for line in data.split('\n'):
    if not line: continue

    # split the line into parts, start at the root list
    # is there a dict here for this part?
    #   yes? cool, dive into it for the next loop iteration
    #   no? add one, with a list, ready for the next loop iteration
    #    (unless we're at the final part, then stick it in the list 
    #     we made/found in the previous loop iteration)

    parts = line.split(':')
    parent_list, current_list = J, J

    for index, part in enumerate(parts):
        for item in current_list:
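            # only dicts are containers; a bare `part in item` would do a substring test on string leaves
            if isinstance(item, dict) and part in item: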
                parent_list, current_list = current_list, item[part]
                break
        else:
            if index == len(parts) - 1:
                # leaf node, add part as string
                current_list.append(part)
            else:
                new_list = []
                current_list.append({part:new_list})
                parent_list, current_list = current_list, new_list      

print(J)

->

[{'Assets': [{'Bank': ['Car', 'House']}, {'Savings': ['Emergency', {'Goals': ['Roof']}]}, 'Reserved']}]
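
To tie this output back to the auto-completion goal in the question, a lookup over the structure might look like the sketch below. The function name `complete` and the way the prefix is written (whole segments ending in a colon) are my own assumptions for illustration, not anything Sublime Text provides, and it does not try to match a partially typed segment.

```python
# The structure printed above, written out as a literal.
tree = [{'Assets': [{'Bank': ['Car', 'House']},
                    {'Savings': ['Emergency', {'Goals': ['Roof']}]},
                    'Reserved']}]


def complete(tree, prefix):
    """Return the names directly under a prefix such as 'Assets:' (hypothetical helper)."""
    current = tree
    # 'Assets:Savings:' -> ['Assets', 'Savings']; '' -> [] (stay at the root)
    for part in [p for p in prefix.split(':') if p]:
        for item in current:
            if isinstance(item, dict) and part in item:
                current = item[part]  # dive into the named list
                break
        else:
            return []  # unknown prefix, or the prefix names a leaf

    names = []
    for item in current:
        if isinstance(item, dict):
            names.extend(item.keys())  # container: offer its name
        else:
            names.append(item)         # leaf: offer the string itself
    return names


print(complete(tree, 'Assets:'))          # ['Bank', 'Savings', 'Reserved']
print(complete(tree, 'Assets:Savings:'))  # ['Emergency', 'Goals']
```

Leaves and containers are returned together in insertion order, which for the sample data gives the Bank, Savings, Reserved list the question asks for.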

Try online: https://repl.it/Ci5L
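
If the conflicting input from the NB above (Assets:Reserved followed by Assets:Reserved:Painting) can turn up anyway, one option is to promote the existing leaf to a container when the longer account arrives. This is only a sketch of that variation on the same list/dict layout. `insert_account` is a hypothetical helper name, and it rests on two assumptions of mine: a leaf that later gains children stops being a leaf, and repeated accounts should not create duplicate entries.

```python
def insert_account(tree, account):
    """Insert 'A:B:C' into the list/dict tree, promoting leaves when needed."""
    current = tree
    parts = account.split(':')
    for index, part in enumerate(parts):
        is_last = index == len(parts) - 1

        # Look for an existing container {part: [...]} at this level.
        container = next(
            (item for item in current if isinstance(item, dict) and part in item),
            None,
        )
        if container is not None:
            current = container[part]
            continue

        if is_last:
            if part not in current:  # assumption: skip duplicate leaves
                current.append(part)
        else:
            children = []
            if part in current:
                # Assumption: a leaf that gains children becomes a container.
                current[current.index(part)] = {part: children}
            else:
                current.append({part: children})
            current = children
    return tree


demo = []
for account in ['Assets:Reserved', 'Assets:Reserved:Painting']:
    insert_account(demo, account)
print(demo)  # [{'Assets': [{'Reserved': ['Painting']}]}]
```

With this rule the result is the same as if only Assets:Reserved:Painting had been given, whichever order the two lines arrive in.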

solution1
3 ACCEPTED 2016-08-02 06:10:55
