
How can I select equally spaced points from a set of points using Python

Assume a list of points (nodes), each with x, y and z coordinates. The distance between two points i and j is D(i,j) = sqrt((xi-xj)^2 + (yi-yj)^2 + (zi-zj)^2). I have 400,000 data points.

Now I want to select a subset of these nodes that are equally spaced, using a previously specified inter-point distance of 0.05, so that the selected points are uniformly distributed.

With a while loop, processing the entire data set takes approximately 3 hours. I am looking for a faster method.

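# Assumes df has a default integer index, x/y/z in columns 1-3
# and a distance column at position 6.
no_rows = len(df)
i = 1
while i < no_rows:
    # coordinates of the previous (kept) row and of the current row
    a1 = df.iloc[i-1, 1]
    a2 = df.iloc[i, 1]
    b1 = df.iloc[i-1, 2]
    b2 = df.iloc[i, 2]
    c1 = df.iloc[i-1, 3]
    c2 = df.iloc[i, 3]
    dist = np.round(((a2-a1)**2+(b2-b1)**2+(c2-c1)**2)**0.5, 5)
    df.iloc[i, 6] = dist
    if dist < 0.05000:
        # too close to the previous kept point: drop the row,
        # then re-check the same position on the next pass
        df = df.drop(i)
        df.reset_index(drop=True, inplace=True)
        no_rows = len(df)
        i = i-1
    i += 1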
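
The loop above is a greedy filter: each row is compared with the last row that was kept, and dropped if it is closer than 0.05. Most of the runtime probably comes from calling `df.drop` and `reset_index` on every removal, not from the distance arithmetic. A minimal sketch of the same rule on plain Python floats, assuming the coordinates can be pulled out as an (n, 3) array; the function name `thin_by_distance` and its parameters are hypothetical, not from the original post:

```python
import math
import numpy as np

def thin_by_distance(xyz, min_dist=0.05):
    """Return the positions of the rows kept by the greedy rule:
    keep a row only if it is at least min_dist from the last kept row."""
    # Plain Python lists iterate faster than indexing a NumPy array row by row.
    pts = np.asarray(xyz, dtype=float).tolist()
    keep = []
    last = None
    for i, (x, y, z) in enumerate(pts):
        if last is not None:
            d = math.sqrt((x - last[0])**2 + (y - last[1])**2 + (z - last[2])**2)
            # Same 5-decimal rounding as the original loop.
            if round(d, 5) < min_dist:
                continue  # too close to the last kept point: skip it
        keep.append(i)
        last = (x, y, z)
    return np.array(keep, dtype=int)
```

With a DataFrame laid out as in the loop, something like `df.iloc[thin_by_distance(df.iloc[:, 1:4].to_numpy())]` would then select the kept rows in a single step. This visits each row once and never rebuilds the DataFrame.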

EDIT

One option is to use pandas directly and merge the dataframe with itself. Something like:

import pandas as pd
import numpy as np

df = pd.DataFrame([
  [131.404866,16.176877,128.120177 ], 
  [131.355045,16.176441,128.115972 ], 
  [131.305224,16.176005,128.111767 ], 
  [131.255403,16.175569,128.107562 ], 
  [131.205582,16.175133,128.103357 ], 
  [131.158858,16.174724,128.099413 ], 
  [131.15576,16.174702,128.09916 ], 
  [131.105928,16.174342,128.095089 ], 
  [131.05988,16.174009,128.091328 ], 
  [131.056094,16.173988,128.09103 ], 
  [131.006249,16.173712,128.087107 ], 
  [130.956404,16.173436,128.083184], 
  ],
  columns=['x', 'y', 'z']
)
df.reset_index(drop=False, inplace=True)

dist = 0.05
df['CROSS'] = 1
df = df.merge(df, on="CROSS")
df.reset_index(drop=True, inplace=True)

df['distance'] = np.round(
    np.sqrt(
      np.square(df['x_x'] - df['x_y'])
      + np.square(df['y_x']-df['y_y'])
      + np.square(df['z_x']-df['z_y'])
    ),
    5
)

#drop values where distances are = 0 (same points)
ix = df[df.distance==0].index 
df.drop(ix, inplace=True)


print('These are all pairs of points separated by the distance', dist)
df.sort_values('distance', inplace=True)
# distances were rounded to 5 decimals above; np.isclose avoids exact float equality
ix = df[np.isclose(df.distance, dist)].index
print(df.loc[ix])
print('-'*50)


points = pd.DataFrame(
        df.loc[ix, ['index_x', 'x_x', 'y_x', 'z_x']].values.tolist() 
        + df.loc[ix, ['index_y', 'x_y', 'y_y', 'z_y']].values.tolist(), 
        columns=['index', 'x', 'y', 'z'])
points.drop_duplicates(keep='first', inplace=True)
print('These are all the points which have another at distance', dist)
print(points)

NumPy's vectorized functions are much faster than a Python loop and let you process the whole dataset at once.
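
As a small illustration of that point, the distances between consecutive rows can be computed in one vectorized call instead of row by row. This is only a sketch: the helper name `consecutive_distances` is hypothetical, and it measures neighbouring rows in their original order, so it does not reproduce the drop-and-recompare behaviour of the loop in the question. Note also that the self-merge above builds n² rows, which for the 400,000 points in the question would be 1.6 × 10^11 pairs, so it is probably only practical on small subsets.

```python
import numpy as np

def consecutive_distances(xyz):
    """Euclidean distance between each row and the one before it."""
    xyz = np.asarray(xyz, dtype=float)
    # np.diff gives the coordinate differences between neighbouring rows;
    # the norm along axis 1 turns each difference vector into a distance.
    return np.linalg.norm(np.diff(xyz, axis=0), axis=1)

# First three sample points from the dataframe above
sample = [
    [131.404866, 16.176877, 128.120177],
    [131.355045, 16.176441, 128.115972],
    [131.305224, 16.176005, 128.111767],
]
print(np.round(consecutive_distances(sample), 5))
```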

Another option could be geopandas, which can also be very fast, but I'm not sure it would help here: its fastest method relies on pyproj's distance computation (written in C), and I don't think there is a 3D equivalent.
