I have a data set of ~500 points in 2D with given coordinates (x, y) between 0 and 10 (which also means I can refer to each point by a single integer index). I'm trying to divide the area into regular square cells by applying a grid. Note that this process is repeated inside an algorithm, and at some point there will be far more than 500 square cells.
What I want to achieve: Loop over all points, for each point find the square cell in which the point lies and save this information.
A few steps later: Loop over all points again, for each point identify its cell and that cell's adjacent cells. Take all the points in these cells and add them to, e.g., a list for further use.
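The index arithmetic behind these two loops can be sketched as follows. This is only an illustration under assumptions: the grid is taken to start at the origin, `cell_size` is the side length of a square cell, and `cell_of` and `neighbourhood` are hypothetical helper names.

```python
import math

def cell_of(x, y, cell_size):
    """Return the (column, row) index of the square cell containing (x, y).

    Assumes the grid starts at (0, 0). A point lying exactly on the upper
    boundary (e.g. x == 10) lands in one extra cell past the last one.
    """
    return (int(math.floor(x / cell_size)), int(math.floor(y / cell_size)))

def neighbourhood(cell):
    """Return the cell itself plus its 8 adjacent cells.

    Indices outside the grid (negative or too large) are not filtered here.
    """
    cx, cy = cell
    return [(cx + dx, cy + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]

cell = cell_of(3.2, 7.9, 0.5)
print(cell)
print(neighbourhood(cell))
```

Because each cell is identified by a pair of integers, that pair can serve directly as a key in whatever container ends up holding the points.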
My thought process: Since there will be a lot of empty cells and I do not want to waste memory on them, use a tree.
Example: cell_39_41 and cell_39_42 each contain one point.
First level: root node with child 39
Second level: 39 node with children 41, 42
Third level: 41 node with child point1 and 42 node with child point2
Fourth level: nodes representing the actual points
If I find more points in cell_39_41 or cell_39_42 they will be added as children of their respective third level nodes.

    class Node(object):
        def __init__(self, data):
            self.data = data
            self.children = []

        def add_child(self, obj):
            self.children.append(obj)

I left out an irrelevant method that returns the points in a cell.
Problems with this implementation:
1. If I add a second or third level node, I have to be able to refer to it later in order to add children or to find the points in a certain cell and its adjacent cells. This means I have to do a lot of costly linear searches, since the children lists are not sorted.
2. I will be adding hundreds of nodes, but I need to be able to refer to them by unique names. This might be a big personal failing, but I cannot think of a way to generate such names in a loop.
So basically I'm pretty sure there's some mistake in my thought process, or maybe this implementation of a tree is not suitable. I have read about a lot of implementations of B-trees and similar structures, but since this problem is limited to 2D I felt that they were overkill and not well suited.
How about this ...

    def add_point(data_dict, row, column, point):
        # Modifies data_dict in place, since dictionaries are mutable.
        # setdefault creates the row dict and the cell list on first use.
        data_dict.setdefault(row, {}).setdefault(column, []).append(point)

    def get_table(data):
        out_dict = {}
        for row, column, point in data:
            add_point(out_dict, row, column, point)
        return out_dict

    if __name__ == "__main__":
        data = [(38, 41, 38411), (39, 41, 39411), (39, 42, 39421)]
        points = get_table(data)
        print(points)
        add_point(points, 39, 42, 39422)
        print(points)

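For the second step in the question (collecting the points of a cell and its adjacent cells), the nested dictionary can be queried with `dict.get`, so empty cells are skipped without being created. A minimal sketch, assuming the `{row: {column: [points]}}` layout that `get_table` produces; `points_near` is a hypothetical helper name, not part of the code above:

```python
def points_near(table, row, column):
    """Collect the points stored in cell (row, column) and its 8 neighbours."""
    found = []
    for r in (row - 1, row, row + 1):
        columns = table.get(r, {})  # missing row -> empty dict, nothing is inserted
        for c in (column - 1, column, column + 1):
            found.extend(columns.get(c, []))  # missing cell -> no points
    return found

# Same layout as produced by get_table: {row: {column: [points]}}
table = {38: {41: [38411]}, 39: {41: [39411], 42: [39421]}}
nearby = points_near(table, 39, 41)
print(nearby)
```

Each call does at most 3 row lookups and 9 cell lookups, regardless of how many cells the grid has.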
Use a dict of dicts as the tree:

    tree = {
        '_data': 123,
        'node1': {
            '_data': 456,
            'node11': {
                'node111': {}
            },
        },
        'node2': {
        },
    }

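Lookups in dicts are fast!

    # adding: setdefault creates the missing intermediate nodes
    # ('node12', 'node123') instead of raising KeyError
    tree['node1'].setdefault('node12', {}).setdefault('node123', {})['_data'] = 123
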
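Unique names:

    shortcuts = {}
    shortcuts['name'] = tree['node1']['node11']['node111']
    # node111 starts out empty, so store a value before reading it back
    shortcuts['name']['_data'] = 789
    print(shortcuts['name']['_data'])
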
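For the grid in the question, the unique names do not have to be strings: a `(row, column)` tuple is hashable and can be generated in a loop. The sketch below is one possible variation, not the approach shown above. It flattens the tree into a single dict keyed by such tuples and reuses the sample data from the other answer:

```python
from collections import defaultdict

# Only cells that actually contain points get an entry.
grid = defaultdict(list)

data = [(38, 41, 38411), (39, 41, 39411), (39, 42, 39421)]
for row, column, point in data:
    grid[(row, column)].append(point)  # the tuple is the cell's unique name

print(grid[(39, 42)])
# Use .get for cells that may be empty: indexing a defaultdict with a
# missing key would silently create an empty list for it.
print(grid.get((0, 0), []))
print(len(grid))
```

Reading with `.get` keeps the dict limited to non-empty cells, which matters when most of the grid is empty.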