
Create a tree from multiple nested dictionaries/lists in Python

Preface: To explain why I am doing this, here is the end goal. I have a list of accounts that are defined in a specific colon-separated syntax. Here are some examples:

Assets:Bank:Car
Assets:Bank:House
Assets:Savings:Emergency
Assets:Savings:Goals:Roof
Assets:Reserved

As can be seen above, an account can have any number of parents and children. The end goal is to parse these accounts into a tree structure in Python that will be used to provide account auto-completion in the Sublime Text editor (i.e., if I typed Assets: and then queried for auto-complete, I would be presented with a list such as: Bank, Savings, Reserved).

The Result: Using the account list from the preface, the desired result in Python would look something like below:

[  
   {  
      "Assets":[  
         {  
            "Bank":[  
               "Car",
               "House"
            ]
         },
         {  
            "Savings":[  
               "Emergency",
               {  
                  "Goals":[  
                     "Roof"
                  ]
               }
            ]
         },
         "Reserved"
      ]
   }
]
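To make the auto-completion goal concrete, here is a minimal sketch of how such a structure could be queried. The function name `complete` and its behaviour for unknown prefixes (returning an empty list) are assumptions for illustration only, not part of the original question.

```python
def complete(tree, prefix):
    """Return the names directly below `prefix` in the nested list/dict tree.

    Hypothetical helper: `tree` has the shape shown above, i.e. a list whose
    items are either strings (leaves) or single-key dicts (containers).
    """
    current = tree
    # "Assets:" splits into ["Assets", ""], so drop the empty pieces
    for part in (p for p in prefix.split(":") if p):
        for item in current:
            if isinstance(item, dict) and part in item:
                current = item[part]  # descend into the container's list
                break
        else:
            return []  # prefix not found, or it names a leaf

    names = []
    for item in current:
        if isinstance(item, dict):
            names.extend(item)  # iterating a dict yields its single key
        else:
            names.append(item)
    return names


tree = [{"Assets": [{"Bank": ["Car", "House"]},
                    {"Savings": ["Emergency", {"Goals": ["Roof"]}]},
                    "Reserved"]}]

print(complete(tree, "Assets:"))  # ['Bank', 'Savings', 'Reserved']
```

The `isinstance` check matters because `part in item` on a string would be a substring test rather than a key lookup.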

Half-Solution: Using recursion, I was able to merge two basic accounts that share the same parents. This works for these two: Assets:Bank:Car and Assets:Bank:House. However, once the accounts start to differ, it falls apart and the recursion gets messy, so I'm not sure recursion is the best approach.

import re

def parse_account(account_str):
    subs = account_str.split(":")

    def separate(subs):
        if len(subs) == 1:
            return subs
        elif len(subs):
            return [{subs[0]: separate(subs[1:])}]

    return separate(subs)

def merge_dicts(a, b):
    # a will be a list with dictionaries and text values and then nested lists/dictionaries/text values
    # b will always be a list with ONE dictionary or text value

    key = list(b[0].keys())[0] # the only key of the only dictionary in the b list (list() makes this work on Python 3 too)

    for item in a: # item is a dictionary or a text value
        if isinstance(item, dict): # if item is a dictionary
            if key in item:
                # Is the value a list with a dict or a list with a text value
                if isinstance(b[0][key][0], str):
                    # Extend the current list with the new value
                    item[key].extend(b[0][key])
                else:
                    # Recurse to the next child
                    merge_dicts(item[key], b[0][key])
            else:
                pass # unfinished: the case where this dict does not contain key is not handled yet

    return a

# Accounts have an "open [name]" syntax for defining them
text = "open Assets:Bank:Car\nopen Assets:Bank:House\nopen Assets:Savings:Emergency\nopen Assets:Savings:Goals:Roof\nopen Assets:Reserved"
EXP = re.compile("open (.*)")
accounts = EXP.findall(text) # This grabs all accounts

# Create a list of all the parsed accounts
dicts = []
for account in accounts:
    dicts.append(parse_account(account))

# Attempt to merge two accounts together
final = merge_dicts(dicts[0], dicts[1])
print(final)

# In the future we would call: reduce(merge_dicts, dicts) to merge all accounts
# (on Python 3, reduce has to be imported from functools)

I could be going about this in the completely wrong way and I would be interested in differing opinions. Otherwise, does anyone have insight into how to make this work with the remaining accounts in the example string?

That took me ages to sort out in my head. The dictionaries are simple: each has exactly one key, whose value is always a list, so a dictionary is effectively a named list.

Each list contains strings (leaf accounts) or further dictionaries of the same shape (one key with a list as its value).

That means we can split 'Assets:Bank:Car' into parts and, starting at the root list, look for a dictionary matching {"Assets":[<whatever>]}, adding one if it is missing, and then jump to the [<whatever>] list two levels deeper. On the next loop iteration, look for a dictionary matching {"Bank":[<whatever>]}, or add one, and again jump to its [<whatever>] list. Keep doing that until we reach the last part, Car. At that point we must be on a list, since we always jumped to an existing list or made a new one, so we append Car to the current list.

NB. this approach would break if you had

Assets:Reserved
Assets:Reserved:Painting

but that input would be contradictory, asking "Reserved" to be both a leaf node and a container. In that situation you would only have:

Assets:Reserved:Painting

right?

data = """
Assets:Bank:Car
Assets:Bank:House
Assets:Savings:Emergency
Assets:Savings:Goals:Roof
Assets:Reserved
"""
J = []

for line in data.split('\n'):
    if not line: continue

    # split the line into parts, start at the root list
    # is there a dict here for this part?
    #   yes? cool, dive into it for the next loop iteration
    #   no? add one, with a list, ready for the next loop iteration
    #    (unless we're at the final part, then stick it in the list 
    #     we made/found in the previous loop iteration)

    parts = line.split(':')
    current_list = J

    for index, part in enumerate(parts):
        for item in current_list:
            # only dicts are containers; a bare `part in item` would do a
            # substring test against leaf strings such as 'Car'
            if isinstance(item, dict) and part in item:
                current_list = item[part]
                break
        else:
            if index == len(parts) - 1:
                # leaf node, add part as string
                current_list.append(part)
            else:
                new_list = []
                current_list.append({part: new_list})
                current_list = new_list

print(J)

->

[{'Assets': [{'Bank': ['Car', 'House']}, {'Savings': ['Emergency', {'Goals': ['Roof']}]}, 'Reserved']}]

Try online: https://repl.it/Ci5L
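If an account does need to be both a leaf and a container (the Assets:Reserved / Assets:Reserved:Painting case above), one possible alternative is to store the tree as plain nested dictionaries, where every account name maps to a dictionary of its children. Note that this is a different shape from the list-of-dicts result asked for in the question; the names `build_trie` and `children` below are hypothetical, and the sketch assumes Python 3.7+ so that dictionaries keep insertion order.

```python
def build_trie(accounts):
    """Build a nested-dict tree: each name maps to a dict of its children."""
    root = {}
    for account in accounts:
        node = root
        for part in account.split(":"):
            # reuse the existing child dict, or create an empty one
            node = node.setdefault(part, {})
    return root


def children(root, prefix):
    """Return the names directly below `prefix`, or [] if it is unknown."""
    node = root
    for part in prefix.split(":"):
        if not part:  # a trailing ':' leaves an empty piece
            continue
        if part not in node:
            return []
        node = node[part]
    return list(node)


accounts = [
    "Assets:Bank:Car",
    "Assets:Bank:House",
    "Assets:Savings:Emergency",
    "Assets:Savings:Goals:Roof",
    "Assets:Reserved",
    "Assets:Reserved:Painting",
]
root = build_trie(accounts)

print(children(root, "Assets:"))           # ['Bank', 'Savings', 'Reserved']
print(children(root, "Assets:Reserved:"))  # ['Painting']
```

In this representation a leaf is simply a name whose dictionary is empty, so no special case is needed when a leaf later gains children.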
