Fast performance array processing in Numpy/Python

Question

I am trying to find out the optimal way (fastest performance) to process coordinate and measurement data stored in several numpy arrays.

I need to calculate the distance from each grid point (lat, lon, alt value in green in the attached image) to each measurement location (lat, lon, alt, range from target in gray in the attached image). Since there are hundreds of grid points, and thousands of measurement ranges to calculate for each grid point, I would like to iterate through the arrays in the most efficient way possible.

[Image: grid points in green, measurement locations in gray]

I am trying to decide how to store the LLA values for the grid and for the measurements, and then what the ideal way is to calculate the Mean Squared Error for each point on the grid, based on the delta between the measured range value and the actual range.

Any ideas on how to best store these values, and then iterate across the grid to determine the range from each measurement would be very much appreciated. Thanks!!!

Currently, I am using a 2D meshgrid to store the LLA values for the grid

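import numpy as np

# Create a 2D grid that will be used to store the MSE estimates.
# First, create two 1-D arrays holding the X and Y coordinates of the grid.
x_delta = abs(xmax - xmin) / gridsize_x
y_delta = abs(ymax - ymin) / gridsize_y
# linspace includes both endpoints (gridsize + 1 points per axis); np.arange
# with a float step can gain or lose the last point to rounding.
X = np.linspace(xmin, xmax, gridsize_x + 1)
Y = np.linspace(ymin, ymax, gridsize_y + 1)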

# Next, pass arrays to meshgrid to return 2-D coordinate matrices from the 1-D coordinate arrays
grid_lon, grid_lat = np.meshgrid(X, Y)

I have the LLA points and range values from the measurements stored in a measurement class

measurement_lon = [measurement.gps.getlon() for measurement in target_measurements]
measurement_lat = [measurement.gps.getlat() for measurement in target_measurements]
measurement_range = [measurement.getrange() for measurement in target_measurements]

Measurement class

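class RangeMeasurement:
    def __init__(self, lat, lon, alt, range):
        # GpsLocation is defined elsewhere; it provides getlat() and getlon()
        self.gps = GpsLocation(lat, lon, alt)
        self.range = range  # measured range from this location to the target

    def getrange(self):
        return self.range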

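Really bad code for the range calculation (iterative and very slow)

range_error = np.zeros(grid_lon.size)  # one accumulated squared error per grid point
for i, (glon, glat) in enumerate(zip(grid_lon.ravel(), grid_lat.ravel())):
    for j in range(len(measurement_lat)):
        # distance() is the point-to-point distance function (not shown)
        range_error[i] += (distance(glon, glat, measurement_lon[j], measurement_lat[j]) - measurement_range[j]) ** 2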

Answer 1

I think the scipy.spatial.distance module will help you out with this problem: http://docs.scipy.org/doc/scipy/reference/spatial.distance.html

You should store your points as 2-d numpy arrays with 2 columns and N rows, where N is the number of points in the array. To convert your grid_lon and grid_lat to this format, use

N1 = grid_lon.size
grid_point_array = np.hstack([grid_lon.reshape((N1,1)), grid_lat.reshape((N1,1))])

This takes all of the values in grid_lon, which are arranged in a rectangular array that is the same shape as the grid, and puts them in an array with one column and N rows. It does the same for grid_lat. The two one-column wide arrays are then combined to create a two column array.
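As a small illustration of the reshaping described above, here is a sketch on a made-up 2 x 3 grid (the coordinate values are hypothetical, not the asker's data). It also shows np.column_stack on the raveled arrays, which should give the same two-column layout:

```python
import numpy as np

# Hypothetical tiny grid: 3 longitudes by 2 latitudes
X = np.array([0.0, 1.0, 2.0])
Y = np.array([10.0, 20.0])
grid_lon, grid_lat = np.meshgrid(X, Y)   # both have shape (2, 3)

# Reshape each 2-D array to one column, then join the columns side by side
N1 = grid_lon.size
grid_point_array = np.hstack([grid_lon.reshape((N1, 1)),
                              grid_lat.reshape((N1, 1))])

# Equivalent, slightly shorter spelling of the same operation
alt_point_array = np.column_stack([grid_lon.ravel(), grid_lat.ravel()])

print(grid_point_array.shape)   # (6, 2)
print(grid_point_array)
```

Each row is one (lon, lat) pair, and the rows follow the grid in row-major order, so the first three rows share the first latitude.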

A similar method can be used to convert your measurement data:

N2 = len(measurement_lon)
measurement_data_array = np.hstack([np.array(measurement_lon).reshape((N2,1)),
    np.array(measurement_lat).reshape((N2,1))])

Once your data is in this format, you can easily find the distances between each pair of points with scipy.spatial.distance:

import scipy.spatial.distance

d = scipy.spatial.distance.cdist(grid_point_array, measurement_data_array, 'euclidean')

d will be an array with N1 rows and N2 columns, and d[i,j] will be the distance between grid point i and measurement point j.
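To make the shape of d concrete, here is a minimal sketch with made-up points on a flat x/y plane (the names grid_pts and meas_pts are illustrative only). It also computes the same matrix with plain NumPy broadcasting, which can serve as a cross-check:

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical points: 2 grid points and 3 measurement locations, one (x, y) per row
grid_pts = np.array([[0.0, 0.0],
                     [3.0, 4.0]])
meas_pts = np.array([[3.0, 4.0],
                     [0.0, 0.0],
                     [6.0, 8.0]])

# d[i, j] is the Euclidean distance from grid point i to measurement point j
d = cdist(grid_pts, meas_pts, 'euclidean')    # shape (2, 3)

# The same matrix via broadcasting: (2, 1, 2) - (1, 3, 2) -> (2, 3, 2)
diff = grid_pts[:, np.newaxis, :] - meas_pts[np.newaxis, :, :]
d_np = np.sqrt((diff ** 2).sum(axis=2))

print(d)
print(np.allclose(d, d_np))
```

The broadcasting version builds an intermediate array of shape (N1, N2, 2), so for large inputs cdist is likely the more memory-friendly choice.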

EDIT Thanks for clarifying range error. Sounds like an interesting project. This should give you the grid point with the smallest accumulated squared error:

measurement_range_array = np.array(measurement_range)
flat_grid_idx = pow(measurement_range_array-d,2).sum(1).argmin()

This takes advantage of broadcasting to get the difference between each measured range and the distance from every grid point to that measurement location. The squared errors for a given grid point are then summed, and the resulting 1-D array is the accumulated error you're looking for. argmin() is called to find the position of the smallest value. To get the x and y grid indices from the flattened index, use

n_cols = grid_lon.shape[1]
grid_x = flat_grid_idx % n_cols
grid_y = flat_grid_idx // n_cols

(The // is integer division. The row length comes from grid_lon.shape[1] rather than gridsize_x because the grid in the question includes both endpoints and so has more than gridsize_x columns.)
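Putting the steps together, the following is a sketch on synthetic data, not the asker's actual setup: the grid, the target position, and the measurement locations are all made up, and the coordinates are treated as a flat x/y plane rather than real lat/lon. It uses the mean instead of the sum of squared errors (the minimum lands on the same grid point) and reshapes the result to the grid shape so there is one MSE value per grid cell:

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical 11 x 6 grid on a flat x/y plane
X = np.linspace(0.0, 10.0, 11)
Y = np.linspace(0.0, 5.0, 6)
grid_lon, grid_lat = np.meshgrid(X, Y)             # shape (6, 11)
grid_pts = np.column_stack([grid_lon.ravel(), grid_lat.ravel()])

# Made-up target and measurement locations; the "measured" ranges are
# the exact distances to the target, so the true error there is zero
target = np.array([[7.0, 2.0]])
meas_pts = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 5.0], [4.0, 4.0]])
meas_range = cdist(meas_pts, target).ravel()       # shape (4,)

d = cdist(grid_pts, meas_pts)                      # shape (66, 4)
sq_err = (meas_range - d) ** 2                     # (4,) broadcasts against (66, 4)
mse = sq_err.mean(axis=1).reshape(grid_lon.shape)  # one MSE per grid cell

# Row and column of the best grid cell, using the real grid shape
row, col = np.unravel_index(mse.argmin(), mse.shape)
print(grid_lon[row, col], grid_lat[row, col])
```

np.unravel_index is an alternative to the % and // arithmetic: it converts the flat index back to (row, column) from the array's own shape.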

1 answer

Answer 1
3 ACCEPTED 2011-12-06 22:50:53
