简体   繁体   中英

Python: Calculating the Path from Parent Child Relationships

Parent) as csv (700,000 rows) input

Child   Parent
fA00    f0
fA9 fA0
fA31    fA0
fA30    fA0
fA1 fA00
dccfA1  fA00
fA2 fA00
fA3 fA00
fA01    fA00
fA4 fA00
fA5 fA00
fA6 fA00
fA7 fA00
fA0 fA00
fA142149    fA00
fA02    fA00
fA8 fA00
qA1 fA10
fA22    fA10
fA23    fA10
fA11    fA10
qA2     fA10
fA15    fA11
fA13    fA11
fA12    fA11
fA14    fA13
fA17    fA16
fA18    fA17
fA19    fA17
fA20    fA17
fA21    fA19
etc....

It goes up to 14 levels deep. The top parent is f0

I want to iterate through the child parent relationships to determine the path

Expected Result

f0 --- top
f0\fa00
f0\fa00\.Child
f0\fa00\.Child2etc
f0\fA0
f0\fA0\.Child
f0\fA0\.Child2etc

How can I do this in Python?

I started out thinking complicated recursive construction of tree structures but basically it is very simple. Create a mapping of child to parent then starting at a child list its parent then the parent's parent up to the top. A recursive routine extracts the child's ancestry easily.

'''
This is the family tree:
------------------------
f0:
    a0:
        b0
        b1:
        b2:
    a1:
        b3:
        b4:
    a2:
        b5:
            c0
            c1
'''
ancestry = [
    ('b1', 'a0'),
    ('c1', 'b5'),
    ('b2', 'a0'),
    ('b3', 'a1'),
    ('b4', 'a1'),
    ('b5', 'a2'),
    ('a0', 'f0'),
    ('a1', 'f0'),
    ('a2', 'f0'),
    ('b0', 'a0'),
    ('c0', 'b5'),
]

And the code is:

parents = set()
children = {}
for c,p in ancestry:
    parents.add(p)
    children[c] = p

# recursively determine parents until child has no parent
def ancestors(p):
    return (ancestors(children[p]) if p in children else []) + [p]

# for each child that has no children print the geneology
for k in (set(children.keys()) - parents):
    print '/'.join(ancestors(k))

Output is:

f0/a1/b4
f0/a0/b0
f0/a0/b1
f0/a0/b2
f0/a1/b3
f0/a2/b5/c1
f0/a2/b5/c0

I'll leave it as an exercise to read the csv file , and maybe sort the outputs better.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM