
3-dimensional data mining performance

I have some data with 3-dimensional positions.

# max size of grid (x, y, z)
grid_size = (999, 999, 99)

class MyObject(object):
  def __init__(self, id):
    self.id = id
    self.trace = []

[...]
# objects have some positions in their "trace"
print(myobject1.trace)
[(65, 128, 12), (66, 128, 12), (66, 129, 12)]
print(myobject2.trace)
[(456, 255, 75), (456, 254, 75), (456, 254, 74)]

I need to create a map with the positions of all of these objects. The goal is to find the most performant way to look up objects in this map. For example, given a list of coordinates: which objects are present at these coordinates?

So I thought about four strategies:

One-dimensional dict with string keys:

{'65.128.12':myobject1, '66.128.12':myobject1, '66.129.12':myobject1, 
 '456.255.75':myobject2, '456.254.75':myobject2, '456.254.74':myobject2}

def find_in_str_map(search_points, map_str):
  found_objects = []
  for trace_point in search_points:
    key = str(trace_point[0])+'.'+str(trace_point[1])+'.'+str(trace_point[2])
    if key in map_str:
      # myobject is the object doing the search; it is assumed to exist in the enclosing scope
      if map_str[key].id != myobject.id:
        found_objects.append(map_str[key])
  return found_objects

One-dimensional dict with int keys:

{6512812:myobject1, 6612812:myobject1, 6612912:myobject1, 
 45625575:myobject2, 45625475:myobject2, 45625474:myobject2}

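def find_in_int_map(search_points, map_int):
  found_myobjects = []
  for trace_point in search_points:
    # pack (x, y, z) into one int; unique while y <= 999 and z <= 99
    key = trace_point[0]*100000+trace_point[1]*100+trace_point[2]
    if key in map_int:
      # myobject is the object doing the search; it is assumed to exist in the enclosing scope
      if map_int[key].id != myobject.id:
        found_myobjects.append(map_int[key])
  return found_myobjects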
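
The int key works because each coordinate gets its own decimal "slot". A minimal sketch of that packing and its inverse follows, assuming the grid limits given above; the names `encode` and `decode` are hypothetical helpers, not part of the original code.

```python
GRID_SIZE = (999, 999, 99)  # max (x, y, z), as in the question

def encode(point):
    """Pack (x, y, z) into a single int: x*100000 + y*100 + z."""
    x, y, z = point
    return x * 100000 + y * 100 + z

def decode(key):
    """Inverse of encode(); only valid while y <= 999 and z <= 99."""
    xy, z = divmod(key, 100)    # last two digits are z
    x, y = divmod(xy, 1000)     # next three digits are y
    return (x, y, z)

key = encode((65, 128, 12))
point = decode(key)
corner = decode(encode(GRID_SIZE))
```

If the grid ever grows beyond these limits, the multipliers would have to grow too, otherwise two different positions could map to the same key.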

One-dimensional dict with tuple (coordinate) keys:

{(65, 128, 12):myobject1, (66, 128, 12):myobject1, (66, 129, 12):myobject1, 
 (456, 255, 75):myobject2, (456, 254, 75):myobject2, (456, 254, 74):myobject2}

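def find_in_tuple_map(search_points, map_tuple):
  found_objects = []
  for trace_point in search_points:
    if trace_point in map_tuple:
      # myobject is the object doing the search; it is assumed to exist in the enclosing scope
      if map_tuple[trace_point].id != myobject.id:
        found_objects.append(map_tuple[trace_point])
  return found_objects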
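
For reference, a tuple-keyed map like the one above could be built from the traces roughly as follows. This is only a sketch: `build_tuple_map` is a hypothetical helper, and it assumes at most one object per position (a later object silently overwrites an earlier one).

```python
class MyObject(object):
    def __init__(self, id):
        self.id = id
        self.trace = []

def build_tuple_map(objects):
    """Map every (x, y, z) position in every trace to the object that owns it."""
    position_map = {}
    for obj in objects:
        for point in obj.trace:
            # Assumption: positions are unique; otherwise the last object wins.
            position_map[point] = obj
    return position_map

myobject1 = MyObject(1)
myobject1.trace = [(65, 128, 12), (66, 128, 12), (66, 129, 12)]
myobject2 = MyObject(2)
myobject2.trace = [(456, 255, 75), (456, 254, 75), (456, 254, 74)]

position_map = build_tuple_map([myobject1, myobject2])
```

Tuples of ints are hashable, so they can be used as dict keys directly, with no string formatting or arithmetic per lookup.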

Three-dimensional dict:

{456: {254: {74: myobject2, 75: myobject2}, 255: {75: myobject2}}, 65: {128: {12: myobject1}}, 66: {128: {12: myobject1}, 129: {12: myobject1}}}

def find_in_3d_map(search_points, map):
  founds_myobjects = []
  for trace_point in search_points:
    x = trace_point[0]
    y = trace_point[1]
    z = trace_point[2]
    if x in map:
      if y in map[x]:
        if z in map[x][y]:
          founds_myobjects.append(map[x][y][z])
  return founds_myobjects
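
The nested layout above could be filled with `dict.setdefault`, which creates the intermediate levels on demand. The following is a sketch under the same one-object-per-position assumption; `build_3d_map` and `lookup_3d` are hypothetical names and have not been benchmarked against the other strategies.

```python
class MyObject(object):
    def __init__(self, id):
        self.id = id
        self.trace = []

def build_3d_map(objects):
    """Build {x: {y: {z: obj}}} from each object's trace."""
    map_3d = {}
    for obj in objects:
        for x, y, z in obj.trace:
            # setdefault returns the existing inner dict or inserts a new one
            map_3d.setdefault(x, {}).setdefault(y, {})[z] = obj
    return map_3d

def lookup_3d(map_3d, point):
    """Return the object at point, or None if any level is missing."""
    x, y, z = point
    return map_3d.get(x, {}).get(y, {}).get(z)

myobject1 = MyObject(1)
myobject1.trace = [(65, 128, 12), (66, 128, 12), (66, 129, 12)]

map_3d = build_3d_map([myobject1])
hit = lookup_3d(map_3d, (66, 129, 12))
miss = lookup_3d(map_3d, (66, 130, 12))
```

Each lookup here costs up to three hash operations instead of one, which is one plausible reason this layout is not faster than a flat dict.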

So I tested the performance of these strategies with timeit (and a large number of objects):

print('str', timeit.timeit('find_in_str_map(bugs, map_str)', number=10, [...]
print('int', timeit.timeit('find_in_int_map(bugs, map_int)', number=10, [...]
print('3d ', timeit.timeit('find_in_3d_map(bugs, map_3d)', number=10, [...]
print('tup', timeit.timeit('find_in_tuple_map(bugs, map_tuple)', number=10, [...]

(Testable code here: http://pastebin.com/FfkeEw9U )
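
For readers without access to the pastebin, here is a small self-contained sketch of how one such measurement can be set up. It is not the elided benchmark above: the map, the search points and the simplified `find_in_tuple_map` are stand-ins, and passing a callable to `timeit.timeit` is just one way to avoid a setup string.

```python
import timeit

def find_in_tuple_map(search_points, map_tuple):
    # Simplified stand-in: returns every object found at the given points
    return [map_tuple[p] for p in search_points if p in map_tuple]

# Tiny illustrative data; a real benchmark would use many more entries
map_tuple = {(65, 128, 12): 'myobject1', (456, 254, 74): 'myobject2'}
search_points = [(65, 128, 12), (1, 2, 3), (456, 254, 74)]

# timeit accepts a zero-argument callable as the statement to time
elapsed = timeit.timeit(
    lambda: find_in_tuple_map(search_points, map_tuple), number=10)
print('tup', elapsed)
```

The lambda adds a small constant call overhead to every run, so it is best used for comparing strategies against each other rather than for absolute timings.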

The results are:

python2.7:

('str', 8.213999032974243)
('int', 5.6337010860443115)
('3d ', 6.18729305267334)
('tup', 5.0934319496154785)

python3.3:

str 10.11169655699996
int 5.984578157000215
3d  6.448565245998907
tup 5.139268291999542

Are there other strategies for storing and searching a collection of 3D coordinates in a map? Can the strategies I presented be optimized?

The easiest way is to use your coordinate tuple as the key of your map.

{(65,128,12):myobject1, (66,128,12):myobject1, (66,129,12):myobject1, 
 (456,255,75):myobject2, (456,254,75):myobject2, (456,254,74):myobject2}    

def find_collisions_tuple_map(bugs, map):
  collisions_bugs = []
  for bug in bugs:
    for trace_point in bug.get_possibles_future_trace_point():
      if trace_point in map:
        collisions_bugs.append(map[trace_point])
  return collisions_bugs

On my computer, it's slightly faster:

('str', 10.188277582443057)
('int', 7.133011876243648)
('3d ', 7.486879201843017)
('tuple ', 6.406966607422291)
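
One further variation on the tuple-keyed map, offered only as an untimed sketch: instead of testing each point in a Python-level loop, intersect the dict's key view with the set of search points. `find_with_key_intersection` is a hypothetical name; the code assumes Python 3, where `dict.keys()` returns a set-like view (Python 2.7 would need `viewkeys()`), and it neither preserves the order of the search points nor keeps duplicates.

```python
def find_with_key_intersection(search_points, map_tuple):
    """Return the objects whose positions appear in search_points."""
    # Key views support set operations, so the membership tests run in C
    hits = map_tuple.keys() & set(search_points)
    return [map_tuple[p] for p in hits]

map_tuple = {
    (65, 128, 12): 'myobject1',
    (66, 128, 12): 'myobject1',
    (456, 254, 74): 'myobject2',
}
search_points = [(66, 128, 12), (0, 0, 0), (456, 254, 74), (66, 128, 12)]

found = find_with_key_intersection(search_points, map_tuple)
```

Whether this beats the plain loop depends on the sizes of the map and of the search list, so it would need to be measured with the same timeit setup as the other strategies.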
