
Python: apply a grid to 2D data and save the non-empty grid cells in a tree

I have a data set of ~500 points in 2D with given coordinates (x, y) between 0 and 10 (which also means I can refer to each point by a single integer index). I am trying to divide the area into regular square cells by applying a grid. Note that this process is repeated inside an algorithm, and at some point there will be far more than 500 square cells.

What I want to achieve: Loop over all points; for each point, find the square cell in which it lies and save this information.
A few steps later: Loop over all points again; for each point, identify its cell and the cells adjacent to it. Take all the points in these cells and add them to, e.g., a list for further use.
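
The first step described above, finding the cell that contains a point, comes down to a floor division of each coordinate by the cell width. A minimal sketch, assuming the domain is [0, 10] in both directions; the function name `cell_index`, the `cell_size` parameter, and the clamping of points that sit exactly on the upper boundary are illustrative assumptions, not part of the original post:

```python
import math


def cell_index(x, y, cell_size, domain_max=10.0):
    """Return the (row, column) of the square cell containing (x, y).

    Hypothetical helper: rows are counted along y, columns along x.
    """
    # Number of cells along one axis of the square domain.
    n_cells = int(math.ceil(domain_max / cell_size))
    # Floor division gives the cell; clamp so that a point lying exactly
    # on the upper boundary falls into the last cell instead of one past it.
    column = min(int(x // cell_size), n_cells - 1)
    row = min(int(y // cell_size), n_cells - 1)
    return row, column


print(cell_index(1.3, 2.6, 1.0))  # (2, 1)
```

The resulting integer pair can be used directly as a key, so no cell object has to exist until a point actually falls into it.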

My thought process: Since there will be a lot of empty cells and I do not want to waste memory on them, use a tree.
Example: cell_39_41 and cell_39_42 each contain a point.
First level: root node with child 39
Second level: node 39 with children 41 and 42
Third level: node 41 with child point1 and node 42 with child point2
Fourth level: nodes representing the actual points
If I find more points in cell_39_41 or cell_39_42, they will be added as children of their respective third-level nodes.

class Node(object):

    def __init__(self, data):
        self.data = data
        self.children = []

    def add_child(self, obj):
        self.children.append(obj)

I left out an irrelevant method that returns the points in a cell.

Problems with this implementation:
1. If I add a second- or third-level node, I have to refer to it later in order to add children or to find the points in a certain cell and its adjacent cells. This means I have to do a lot of costly linear searches, since the children lists are not sorted.
2. I will be adding hundreds of nodes, but I need to be able to refer to them by unique names. This might be a big personal failing, but I cannot think of a way to generate such names in a loop.

So I'm pretty sure there is some mistake in my thought process, or maybe the tree implementation I used is not suitable. I have read about a lot of implementations of B-trees and similar structures, but since this problem is limited to 2D, I felt that they were overkill and not well suited.

How about this ...

def add_point(data_dict, row, column, point):
    # modifies source of data_dict in place, since dictionaries are mutable
    data_dict.setdefault(row, {}).setdefault(column, []).append(point)

def get_table(data):
    out_dict = {}
    for row, column, point in data:
        add_point(out_dict, row, column, point)
    return out_dict


if __name__ == "__main__":
    data = [(38, 41, 38411), (39, 41, 39411), (39, 42, 39421)]
    points = get_table(data)
    print(points)
    add_point(points, 39, 42, 39422)
    print(points)
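
The second step from the question (collecting the points of a cell and of its adjacent cells) could be layered on top of a nested dict of this shape. A sketch under the assumption that the structure is `{row: {column: [points]}}` as built by `add_point`; the helper name `points_near` is hypothetical. It reads with `dict.get` rather than `setdefault`, so querying an empty cell does not create an entry for it:

```python
def points_near(data_dict, row, column):
    """Collect the points in cell (row, column) and its up to 8 neighbours."""
    found = []
    for r in (row - 1, row, row + 1):
        columns = data_dict.get(r)
        if columns is None:
            continue  # no occupied cell anywhere in this row
        for c in (column - 1, column, column + 1):
            # Missing cells contribute nothing and are not inserted.
            found.extend(columns.get(c, []))
    return found


# Same sample data as above, written out as the nested dict it produces.
table = {38: {41: [38411]}, 39: {41: [39411], 42: [39421]}}
print(points_near(table, 39, 42))  # [38411, 39411, 39421]
```

Each query touches at most nine dict entries, regardless of how many cells the grid has.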

Use a dict of dicts as the tree:

tree = {
    '_data': 123,
    'node1': {
        '_data': 456,
        'node11': {
            'node111': {}
        },
    },
    'node2': {
    }
}

Lookups in dicts are fast!

tree['node1']['node11']['node111']['_data'] = 123  # adding data to an existing node

Unique names:

shortcuts = {}
shortcuts['name'] = tree['node1']['node11']['node111']
print(shortcuts['name']['_data'])
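
On the question's second problem (generating unique names in a loop), the key itself can act as the name, so nothing has to be invented. A sketch of one variation, assuming a flat dict keyed by `(row, column)` tuples instead of nested dicts; this is not what either answer proposes, and `build_cells` is a hypothetical name:

```python
from collections import defaultdict


def build_cells(data):
    """Group points by their (row, column) cell; only occupied cells are stored."""
    cells = defaultdict(list)
    for row, column, point in data:
        cells[(row, column)].append(point)
    return cells


data = [(38, 41, 38411), (39, 41, 39411), (39, 42, 39421)]
cells = build_cells(data)
print(sorted(cells))            # [(38, 41), (39, 41), (39, 42)]
print(cells.get((39, 42), []))  # [39421]
```

Indexing a `defaultdict` with a missing key inserts an empty list for it, so read-only queries are better done with `.get`, as in the last line, to keep empty cells out of memory.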
