
Traversing of arbitrary order tree

I know this has been asked before and I've seen the answers but still can't figure out what is happening.

I'm trying to conditionally build folder structures based on certain metadata of files (dates and locations) and a set of conditions. For example, for testing I'm using these:

COND = ["Y", "m", "C"]

This means the files should be split into folders first by year, then by calendar month, then by country of origin.

This is the example data I created for testing:

import datetime as dt

data = [
    ["111", dt.datetime(2019, 1, 1), "Aus", "Bri"],
    ["112", dt.datetime(2019, 1, 5), "Aus", "Bri"],
    ["113", dt.datetime(2019, 2, 10), "Aus", "Mel"],
    ["114", dt.datetime(2020, 1, 1), "Aus", "Per"],
    ["115", dt.datetime(2020, 1, 10), "Aus", "Per"],
    ["116", dt.datetime(2020, 1, 25), "Aus", "Per"],
    ["117", dt.datetime(2020, 10, 5), "My", "KL"],
    ["118", dt.datetime(2020, 11, 6), "Ru", "Led"],
    ["119", dt.datetime(2020, 12, 1), "Ru", "Mos"],
    ["120", dt.datetime(2021, 3, 5), "Aus", "Syd"],
    ["121", dt.datetime(2021, 5, 1), "Aus", "Mel"],
    ["122", dt.datetime(2021, 6, 1), "Aus", "Per"],
    ["123", dt.datetime(2021, 11, 1), "Chi", "Bei"],
    ["124", dt.datetime(2021, 11, 15), "Jp", "Tok"],
    ["125", dt.datetime(2022, 1, 1), "Aus", "Per"],
    ["126", dt.datetime(2022, 3, 1), "Aus", "Bri"],
    ["127", dt.datetime(2022, 3, 5), "Aus", "Per"],
    ["128", dt.datetime(2022, 3, 11), "My", "KL"],
    ["129", dt.datetime(2022, 5, 1), "Aus", "Syd"],
    ["130", dt.datetime(2022, 8, 8), "Aus", "Bri"],
]

And these simple functions perform filtering:

def filter_year(data: list[list[str | dt.datetime]]) -> set[int]:
    return {i[1].year for i in data}


def filter_month(data: list[list[str | dt.datetime]]) -> set[int]:
    return {i[1].month for i in data}


def filter_day(data: list[list[str | dt.datetime]]) -> set[int]:
    return {i[1].day for i in data}


def filter_country(data: list[list[str | dt.datetime]]) -> set[str]:
    return {i[2] for i in data}


def filter_city(data: list[list[str | dt.datetime]]) -> set[str]:
    return {i[3] for i in data}

condition_dict = {
    "Y": {'fun': filter_year, 'id': 1},
    "m": {'fun': filter_month, 'id': 1},
    "d": {'fun': filter_day, 'id': 1},
    "C": {'fun': filter_country, 'id': 2},
    "c": {'fun': filter_city, 'id': 3},
}

I'm trying to build the structure automatically using an arbitrary-order tree. The splitting of data at each Node works correctly:

from typing import Any
from pathlib import Path
from dataclasses import dataclass, field

@dataclass
class Node:
    folder: Path
    metadata: list[list[Any]] = field(default_factory=list)
    conditions: list[str] = field(default_factory=list)
    
    @property
    def children(self) -> list['Node']:
        if len(self.conditions) == 0:
            return []
        current_condition = self.conditions[0]
        fun = condition_dict[current_condition]['fun']
        
        fnames: set[int | str] = fun(self.metadata)
        children_data = {str(n): {} for n in fnames}
        for f in fnames:
            children_data[str(f)]['folder'] = self.folder / str(f)    
            children_data[str(f)]['conditions'] = self.conditions[1:]   
            if current_condition == 'Y':
                children_data[str(f)]['metadata'] = [i for i in self.metadata if i[1].year == f]
            elif current_condition == 'm':
                children_data[str(f)]['metadata'] = [i for i in self.metadata if i[1].month == f]
            elif current_condition == 'd':
                children_data[str(f)]['metadata'] = [i for i in self.metadata if i[1].day == f]    
            elif current_condition == 'C':
                children_data[str(f)]['metadata'] = [i for i in self.metadata if i[2] == f]
            elif current_condition == 'c':
                children_data[str(f)]['metadata'] = [i for i in self.metadata if i[3] == f]
        
        return [Node(**i) for i in children_data.values()]

Now I'm trying to traverse the tree, for which I used a modified version of the answer to "Traverse Non-Binary Tree":

@dataclass
class Tree:
    def traverse(self, root: Node):
        r = root.children
        if not r or len(root.conditions) == 0:
            print('The end of subtree:', root.folder)
        else:
            for child in r:            
                print('\n'.join(str(i.folder) for i in r))
                if isinstance(child, Node):
                    for x in self.traverse(child):
                        print(str(x.folder))
                else:
                    print(child) 

But when I run it on my data, after a few correct outputs I always hit the error "'NoneType' object is not iterable":

n = Node(folder=Path('/home'), metadata=data, conditions=COND)

tree = Tree()
tree.traverse(n)

Output:

/home/2019
/home/2020
/home/2021
/home/2022
/home/2019/1
/home/2019/2
/home/2019/1/Aus
The end of subtree: /home/2019/1/Aus
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/home/pavel/python/photo_manager/temp/tree_test.ipynb Cell 4 in <cell line: 4>()
      1 n = Node(folder=Path('/home'), metadata=data, conditions=COND)
      3 tree = Tree()
----> 4 tree.traverse(n)

/home/pavel/python/photo_manager/temp/tree_test.ipynb Cell 4 in Tree.traverse(self, root)
     45 print('\n'.join(str(i.folder) for i in r))
     46 if isinstance(child, Node):
---> 47     for x in self.traverse(child):
     48         print(str(x.folder))
     49 else:

/home/pavel/python/photo_manager/temp/tree_test.ipynb Cell 4 in Tree.traverse(self, root)
     45 print('\n'.join(str(i.folder) for i in r))
     46 if isinstance(child, Node):
---> 47     for x in self.traverse(child):
     48         print(str(x.folder))
     49 else:

/home/pavel/python/photo_manager/temp/tree_test.ipynb Cell 4 in Tree.traverse(self, root)
     45 print('\n'.join(str(i.folder) for i in r))
     46 if isinstance(child, Node):
---> 47     for x in self.traverse(child):
     48         print(str(x.folder))
     49 else:

TypeError: 'NoneType' object is not iterable

I don't understand why this is happening as I believe I guarded against NoneType. For some reason I'm only getting to the end of one subtree but not traversing the others. What am I doing wrong here?

I didn't really follow the whole story, but the error you get on this line is expected:

 for x in self.traverse(child):

The thing is that self.traverse doesn't have a return statement, so this recursive call returns None, and for x in None makes no sense.
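To see that failure in isolation, here is a minimal sketch that has nothing to do with the tree itself (the function name visit is made up for illustration):

```python
def visit(label: str):
    # There is no return statement, so Python implicitly returns None.
    print("visiting", label)


result = visit("a")
print(result is None)

try:
    # Same shape as `for x in self.traverse(child)`: looping over None.
    for x in visit("b"):
        pass
except TypeError as exc:
    message = str(exc)
    print(message)
```

The loop raises before its body ever runs, which is why the traceback points at the for line rather than at anything inside it.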

I think you actually don't want to get some x values from that recursive call, since that recursive call takes care of its own business. There is no need to print again what is already printed by that recursive call.

There is a second issue here:

        for child in r:            
            print('\n'.join(str(i.folder) for i in r))

Here, for each child in r, you iterate r again in the print call. That will just print duplicates. You only need to print the current child from r. That also makes the else block below it obsolete: once you have printed child.folder, it seems unnecessary to print child again.
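A small sketch of the difference, using a hypothetical list of folder strings in place of the Node children and collecting lines into lists instead of printing them:

```python
# Hypothetical stand-in for the folders of the children in r.
children = ["/home/2019", "/home/2020", "/home/2021"]

# Original pattern: the whole list is emitted once per child.
repeated = []
for child in children:
    repeated.extend(children)

# Corrected pattern: only the current child is emitted.
once = []
for child in children:
    once.append(child)

print(len(repeated), len(once))
```

With three children the first loop produces nine lines and the second produces three.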

So correcting both issues, the following at least runs without error:

@dataclass
class Tree:
    def traverse(self, root: Node):
        r = root.children
        if not r or len(root.conditions) == 0:
            print('The end of subtree:', root.folder)
        else:
            for child in r:
                print(str(child.folder))
                if isinstance(child, Node):
                    self.traverse(child)
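If the intention behind for x in self.traverse(child) was to get the nodes back from the recursive call rather than print them, one option is to make the traversal a generator. The following is only a sketch under that assumption: SimpleNode is a hypothetical stand-in for your Node that stores its children in a plain list instead of computing them from metadata and conditions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Iterator


@dataclass
class SimpleNode:
    # Hypothetical simplified node: children are stored directly.
    folder: PurePosixPath
    children: list[SimpleNode] = field(default_factory=list)


def traverse(root: SimpleNode) -> Iterator[SimpleNode]:
    """Yield every descendant of root, depth-first, parents before children."""
    for child in root.children:
        yield child
        # `yield from` forwards whatever the recursive call produces, so
        # the caller always gets an iterator back and never None.
        yield from traverse(child)


home = PurePosixPath("/home")
tree = SimpleNode(home, [
    SimpleNode(home / "2019", [
        SimpleNode(home / "2019" / "1"),
        SimpleNode(home / "2019" / "2"),
    ]),
    SimpleNode(home / "2020"),
])

visited = [str(node.folder) for node in traverse(tree)]
print("\n".join(visited))
```

With this shape the caller decides what to do with each node (print it, create the folder, and so on), and a leaf simply yields nothing instead of returning None.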
