I'm doing some work on node trees and I'm stuck with this issue. This list contains all the information of a tree:
connections = ['Module/Expr/ListComp/BinOp/Name/id/i/',
'Module/Expr/ListComp/BinOp/Sub/',
'Module/Expr/ListComp/BinOp/Num/0.5/',
'Module/Expr/ListComp/comprehension/Name/id/i/',
'Module/Expr/ListComp/comprehension/Name/id/inp/']
I need to convert this into:
{'Module':'Expr', 'Expr':'ListComp', 'ListComp':'BinOp comprehension',
'BinOp':'Name Sub Num', 'Name':'id', 'id':'i', 'Num':'0.5',
'comprehension':'Name', 'Name':'id', 'id':'i inp'}
The goal is to parse the connections into a dictionary of the form {'parent':'child(s)'}. To do this, I have already tried the following:
rules = {}
connections_list = [[word for word in path.split("/") if word] for path in connections]
for path in connections_list:
    for i, word in enumerate(path):
        same_level = [y[i+1] for y in connections_list if len(connections_list) > i+1]
        if same_level:
            unique_on_level = list(set(same_level))
            rules.update({word:" ".join(unique_on_level)})
        else:
            pass
    break
print(rules)
With an output:
{'Module': 'Expr',
'Expr': 'ListComp',
'ListComp': 'BinOp comprehension',
'BinOp': 'Num Sub Name'}
I can't figure out how to do this. The problem seems to occur around the last nodes, but I don't know how to solve it. Any idea how to fix this?
First create a mapping of parent to children nodes, and then remove the dupes.
rules = {}
for connection in connections:
    parts = connection.rstrip("/").split("/")
    for parent, child in zip(parts, parts[1:]):
        if parent not in rules:
            rules[parent] = []
        rules[parent].append(child)
rules = {k: " ".join({}.fromkeys(v)) for k, v in rules.items()}
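As a quick check, the same idea can be run as a self-contained script against the data from the question. This is only a sketch: `setdefault` is swapped in for the explicit membership test, and it assumes Python 3.7+ so that dicts keep insertion order.

```python
connections = [
    'Module/Expr/ListComp/BinOp/Name/id/i/',
    'Module/Expr/ListComp/BinOp/Sub/',
    'Module/Expr/ListComp/BinOp/Num/0.5/',
    'Module/Expr/ListComp/comprehension/Name/id/i/',
    'Module/Expr/ListComp/comprehension/Name/id/inp/',
]

rules = {}
for connection in connections:
    parts = connection.rstrip("/").split("/")
    # Pair every node with the node that follows it in the path.
    for parent, child in zip(parts, parts[1:]):
        rules.setdefault(parent, []).append(child)

# dict.fromkeys drops repeated children while keeping first-seen order.
rules = {parent: " ".join(dict.fromkeys(children))
         for parent, children in rules.items()}

print(rules)
# {'Module': 'Expr', 'Expr': 'ListComp', 'ListComp': 'BinOp comprehension',
#  'BinOp': 'Name Sub Num', 'Name': 'id', 'id': 'i inp', 'Num': '0.5',
#  'comprehension': 'Name'}
```

Because the dict is keyed by node name alone, every Name node in the tree shares one entry, and likewise every id node.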
Based on @wim's answer and the comments, I think this should work:
from collections import defaultdict
rule_data = defaultdict(set)
for connection in connections:
    parts = connection.rstrip("/").split("/")
    for level, (parent, child) in enumerate(zip(parts, parts[1:])):
        rule_data[level, parent].add(child)
rules = [
    (parent, " ".join(sorted(children)))
    for (_level, parent), children in rule_data.items()
]
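For reference, here is a self-contained run of the level-keyed grouping above on the question's data. The function name `rules_by_level` is made up for illustration; the logic is otherwise the same.

```python
from collections import defaultdict

connections = [
    'Module/Expr/ListComp/BinOp/Name/id/i/',
    'Module/Expr/ListComp/BinOp/Sub/',
    'Module/Expr/ListComp/BinOp/Num/0.5/',
    'Module/Expr/ListComp/comprehension/Name/id/i/',
    'Module/Expr/ListComp/comprehension/Name/id/inp/',
]


def rules_by_level(paths):
    """Group children by (depth, parent) and return (parent, children) pairs."""
    rule_data = defaultdict(set)
    for path in paths:
        parts = path.rstrip("/").split("/")
        for level, (parent, child) in enumerate(zip(parts, parts[1:])):
            rule_data[level, parent].add(child)
    # Sorting makes the joined string independent of set iteration order.
    return [
        (parent, " ".join(sorted(children)))
        for (_level, parent), children in rule_data.items()
    ]


for parent, children in rules_by_level(connections):
    print(f"{parent} -> {children}")
# Module -> Expr
# Expr -> ListComp
# ListComp -> BinOp comprehension
# BinOp -> Name Num Sub
# Name -> id
# id -> i inp
# Num -> 0.5
# comprehension -> Name
```

With this input both Name nodes sit at depth 4, so they share one (level, parent) key and the result has a single Name entry and a single id entry.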
Notes:
Using set discards the order of the children; if it's important, we can instead use a dict (or, for compatibility with older versions of Python, OrderedDict):
    rule_data = defaultdict(dict)
    rule_data[level, parent][child] = None
    (parent, " ".join(children))
I sort the children for stability of the output, so that unit tests can work easily and so any downstream processing doesn't see spurious changes.
As @Prune noted, this doesn't seem like a natural representation for the data: the rule_data intermediate variable here may be more useful in further processing than the final form...

connections = ['Module/Expr/ListComp/BinOp/Name/id/i/',
'Module/Expr/ListComp/BinOp/Sub/',
'Module/Expr/ListComp/BinOp/Num/0.5/',
'Module/Expr/ListComp/comprehension/Name/id/i/',
'Module/Expr/ListComp/comprehension/Name/id/inp/']
splitted_conns = [conn.strip('/').split('/') for conn in connections]
res = {}
for conn in splitted_conns:
    for root, child in zip(conn[:-1], conn[1:]):
        res[root] = res.get(root, set()) | {child}
print(res)
output:
{'Module': {'Expr'}, 'Expr': {'ListComp'}, 'ListComp': {'BinOp', 'comprehension'}, 'BinOp': {'Sub', 'Num', 'Name'}, 'Name': {'id'}, 'id': {'inp', 'i'}, 'Num': {'0.5'}, 'comprehension': {'Name'}}
Something like this? Your expected output contains a duplicate node "id"; I suppose that's a mistake, isn't it?
As for the solution, I decided to use a set for each group of children, so we avoid failures with nodes whose names contain a space. Sets also make it easy to avoid duplicates. If you want to convert each set into a space-separated string, you can try this:
{k: ' '.join(v) for k, v in res.items()}
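The repeated Name and id keys in the question's expected output cannot coexist in a single dict, since dict keys are unique. If those repeats are intentional (one entry per position in the tree), one possible sketch is to group children by the full path leading to the parent and return a list of pairs instead. The function name `rules_by_position` is hypothetical, and the code assumes Python 3.7+ for ordered dicts.

```python
connections = [
    'Module/Expr/ListComp/BinOp/Name/id/i/',
    'Module/Expr/ListComp/BinOp/Sub/',
    'Module/Expr/ListComp/BinOp/Num/0.5/',
    'Module/Expr/ListComp/comprehension/Name/id/i/',
    'Module/Expr/ListComp/comprehension/Name/id/inp/',
]


def rules_by_position(paths):
    """Return (parent, "child child ...") pairs, one per distinct tree position."""
    # Maps a path prefix (tuple of node names) to its children.
    # The inner dict is used as an insertion-ordered set.
    children_at = {}
    for path in paths:
        parts = path.strip("/").split("/")
        for depth in range(1, len(parts)):
            prefix = tuple(parts[:depth])
            children_at.setdefault(prefix, {})[parts[depth]] = None
    # The last element of each prefix is the parent node's own name.
    return [(prefix[-1], " ".join(children))
            for prefix, children in children_at.items()]


for parent, children in rules_by_position(connections):
    print(f"{parent!r}: {children!r}")
# 'Module': 'Expr'
# 'Expr': 'ListComp'
# 'ListComp': 'BinOp comprehension'
# 'BinOp': 'Name Sub Num'
# 'Name': 'id'
# 'id': 'i'
# 'Num': '0.5'
# 'comprehension': 'Name'
# 'Name': 'id'
# 'id': 'i inp'
```

Here the Name under BinOp and the Name under comprehension are kept apart, which matches the pairs listed in the question. Joining with a space still has the ambiguity mentioned above for node names that contain spaces.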