Imagine an empty 3x4 NumPy array for which you know the coordinates of the top-left corner and the step size in the horizontal and vertical directions. I would like to compute the coordinates of the middle of each cell for the whole array.
For this I implemented a nested for-loop.
In [12]:
import numpy as np
# extent = (topleft_x, stepsize_x, 0, topleft_y, 0, stepsize_y); stepsize_y is negative since the origin is the top-left corner
extent = (5530000.0, 5000.0, 0.0, 807000.0, 0.0, -5000.0)
array = np.zeros([3,4],object)
cols = array.shape[0]
rows = array.shape[1]
# function to apply to each cell
def f(x,y):
    return x*extent[1]+extent[0]+extent[1]/2, y*extent[5]+extent[3]+extent[5]/2
# nested for-loop
def nestloop(cols,rows):
    for col in range(cols):
        for row in range(rows):
            array[col,row] = f(col,row)
In [13]:
%timeit nestloop(cols,rows)
100000 loops, best of 3: 17.4 µs per loop
In [14]:
array.T
Out[14]:
array([[(5532500.0, 804500.0), (5537500.0, 804500.0), (5542500.0, 804500.0)],
[(5532500.0, 799500.0), (5537500.0, 799500.0), (5542500.0, 799500.0)],
[(5532500.0, 794500.0), (5537500.0, 794500.0), (5542500.0, 794500.0)],
[(5532500.0, 789500.0), (5537500.0, 789500.0), (5542500.0, 789500.0)]], dtype=object)
I am eager to learn, though: how can I optimize this? I was thinking of vectorizing it or using a lambda. I tried to vectorize it as follows:
array[:,:] = np.vectorize(check)(cols,rows)
ValueError: could not broadcast input array from shape (2) into shape (3,4)
But then I got the broadcasting error shown above. The array is currently 3 by 4, but it could also become 3000 by 4000.
Surely the way you are computing the x and y coordinates is highly inefficient because it's not vectorized at all. You can do:
In [1]: import numpy as np
In [2]: extent = (5530000.0, 5000.0, 0.0, 807000.0, 0.0, -5000.0)
...: x_steps = np.array([0,1,2]) * extent[1]
...: y_steps = np.array([0,1,2,3]) * extent[-1]
...:
In [3]: x_coords = extent[0] + x_steps + extent[1]/2
...: y_coords = extent[3] + y_steps + extent[-1]/2
...:
In [4]: x_coords
Out[4]: array([ 5532500., 5537500., 5542500.])
In [5]: y_coords
Out[5]: array([ 804500., 799500., 794500., 789500.])
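For a grid of arbitrary size (for example 3000 by 4000), the hard-coded index lists can be replaced by np.arange. The following is only a minimal sketch of that idea; the function name cell_centers_1d and the parameters n_x and n_y are hypothetical names introduced here, not part of the question.

```python
import numpy as np

# Same tuple as in the question:
# (topleft_x, stepsize_x, 0, topleft_y, 0, stepsize_y)
extent = (5530000.0, 5000.0, 0.0, 807000.0, 0.0, -5000.0)

def cell_centers_1d(extent, n_x, n_y):
    """Return 1-D arrays with the cell-centre x and y coordinates.

    n_x and n_y (hypothetical names) are the number of cells along x and y.
    """
    # Index + 0.5 moves from the cell corner to the cell centre.
    x_coords = extent[0] + (np.arange(n_x) + 0.5) * extent[1]
    y_coords = extent[3] + (np.arange(n_y) + 0.5) * extent[5]
    return x_coords, y_coords

x_coords, y_coords = cell_centers_1d(extent, 3, 4)
```

Only n_x + n_y values are computed here; the full grid of pairs is built from these two arrays afterwards.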
At this point the coordinates of the points are given by the Cartesian product (itertools.product()) of these two arrays:
In [5]: import itertools as it
   ...: list(it.product(x_coords, y_coords))
Out[5]: [(5532500.0, 804500.0), (5532500.0, 799500.0), (5532500.0, 794500.0), (5532500.0, 789500.0), (5537500.0, 804500.0), (5537500.0, 799500.0), (5537500.0, 794500.0), (5537500.0, 789500.0), (5542500.0, 804500.0), (5542500.0, 799500.0), (5542500.0, 794500.0), (5542500.0, 789500.0)]
You just have to group them 4 by 4.
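As a small sketch of that grouping, assuming the x_coords and y_coords values shown above: itertools.product varies its last argument fastest, so each consecutive run of len(y_coords) pairs shares one x value, and a plain reshape does the grouping. The variable names pairs and grouped are illustrative only.

```python
import itertools as it
import numpy as np

x_coords = np.array([5532500.0, 5537500.0, 5542500.0])
y_coords = np.array([804500.0, 799500.0, 794500.0, 789500.0])

# x varies slowest, y varies fastest in the output of product().
pairs = list(it.product(x_coords, y_coords))

# Group the 12 pairs into 3 rows (one per x) of 4 pairs (one per y).
grouped = np.array(pairs).reshape(len(x_coords), len(y_coords), 2)
```

Going through a Python list is convenient for small grids, but for large ones a pure numpy construction is likely to be faster.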
To obtain the product with numpy you can do (based on this answer):
In [6]: np.transpose([np.tile(x_coords, len(y_coords)), np.repeat(y_coords, len(x_coords))])
Out[6]:
array([[ 5532500., 804500.],
[ 5537500., 804500.],
[ 5542500., 804500.],
[ 5532500., 799500.],
[ 5537500., 799500.],
[ 5542500., 799500.],
[ 5532500., 794500.],
[ 5537500., 794500.],
[ 5542500., 794500.],
[ 5532500., 789500.],
[ 5537500., 789500.],
[ 5542500., 789500.]])
Which can be reshaped:
In [8]: product.reshape((3,4,2)) # product is the result of the above
Out[8]:
array([[[ 5532500., 804500.],
[ 5537500., 804500.],
[ 5542500., 804500.],
[ 5532500., 799500.]],
[[ 5537500., 799500.],
[ 5542500., 799500.],
[ 5532500., 794500.],
[ 5537500., 794500.]],
[[ 5542500., 794500.],
[ 5532500., 789500.],
[ 5537500., 789500.],
[ 5542500., 789500.]]])
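If this is not the order you want (in the flat product x varies fastest, so the reshape above puts different x values in the same row), you can do something like: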
In [9]: ar = np.zeros((3,4,2), float)
...: ar[0] = product[::3]
...: ar[1] = product[1::3]
...: ar[2] = product[2::3]
...:
In [10]: ar
Out[10]:
array([[[ 5532500., 804500.],
[ 5532500., 799500.],
[ 5532500., 794500.],
[ 5532500., 789500.]],
[[ 5537500., 804500.],
[ 5537500., 799500.],
[ 5537500., 794500.],
[ 5537500., 789500.]],
[[ 5542500., 804500.],
[ 5542500., 799500.],
[ 5542500., 794500.],
[ 5542500., 789500.]]])
I believe there are better ways to do this last reshaping, but I'm not a numpy expert.
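Two possible alternatives for that last step, offered as a sketch rather than a definitive answer: np.meshgrid with indexing="ij" followed by np.stack, or a reshape plus transpose of the flat product. The names xx, yy and grid are introduced here for illustration.

```python
import numpy as np

x_coords = np.array([5532500.0, 5537500.0, 5542500.0])
y_coords = np.array([804500.0, 799500.0, 794500.0, 789500.0])

# indexing="ij" makes the first axis follow x and the second follow y,
# so both grids have shape (3, 4).
xx, yy = np.meshgrid(x_coords, y_coords, indexing="ij")

# Stack the two grids along a new last axis: shape (3, 4, 2).
grid = np.stack([xx, yy], axis=-1)

# Alternative starting from the flat product, in which x varies fastest:
# reshape to (4, 3, 2) first, then swap the first two axes.
product = np.transpose([np.tile(x_coords, len(y_coords)),
                        np.repeat(y_coords, len(x_coords))])
ar = product.reshape(len(y_coords), len(x_coords), 2).transpose(1, 0, 2)
```

Both grid and ar hold the pair (x_coords[i], y_coords[j]) at position [i, j], which is the same layout as the slice-by-slice assignment above.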
Note that using object as the dtype carries a huge performance penalty, since numpy cannot optimize anything (and is sometimes slower than using normal lists). I have used a (3,4,2) array instead, which allows faster operations.
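To illustrate the difference, here is a rough sketch that fills a float array of shape (3,4,2) by broadcasting and checks it against the object array produced by the nested loop from the question. The names n_x, n_y, obj, grid, same and shifted are hypothetical and only used for this example.

```python
import numpy as np

extent = (5530000.0, 5000.0, 0.0, 807000.0, 0.0, -5000.0)
n_x, n_y = 3, 4  # hypothetical names for the grid dimensions

# Object array of tuples, filled cell by cell as in the question.
obj = np.zeros((n_x, n_y), dtype=object)
for i in range(n_x):
    for j in range(n_y):
        obj[i, j] = (i * extent[1] + extent[0] + extent[1] / 2,
                     j * extent[5] + extent[3] + extent[5] / 2)

# Float array filled by broadcasting, without a Python-level loop.
grid = np.empty((n_x, n_y, 2), dtype=float)
grid[..., 0] = (extent[0] + (np.arange(n_x) + 0.5) * extent[1])[:, None]
grid[..., 1] = (extent[3] + (np.arange(n_y) + 0.5) * extent[5])[None, :]

# Both layouts hold the same pair in every cell.
same = all(tuple(grid[i, j]) == obj[i, j]
           for i in range(n_x) for j in range(n_y))

# Arithmetic on the float array is vectorized, e.g. shift every x by 100.
shifted = grid + np.array([100.0, 0.0])
```

With the float layout, grid[..., 0] and grid[..., 1] give all x and all y values as ordinary numeric arrays, which is not possible with an object array of tuples.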