How to optimize very large array construction with numpy or scipy

I am processing very large data sets in 64-bit Python and need some help optimizing my interpolation code.

I am used to using numpy to avoid loops, but here there are two loops that I can't find a way to avoid.

The main problem is that the arrays I need to compute are so large that numpy raises a MemoryError. I switched to scipy sparse matrices, which works, but the two remaining loops take far too long to run.

I also tried to build my matrix with numpy.fromfunction, but it won't run because the array is too large.

I have already read a lot of posts about building large arrays but the arrays that were asked about were too simple compared to what I have to build so the solutions don't work here.

I cannot reduce the size of the data set, since it is a point cloud that I have already tiled in 10x10 tiles.

Here is my interpolation code:

z_int = ss.dok_matrix((x_int.shape))

n,p = x_obs.shape
m = y_obs.shape[0]
a1 = ss.coo_matrix( (n, 3), dtype=np.int64 )
a2 = ss.coo_matrix( (3, 3), dtype=np.int64 )
a3 = ss.dok_matrix( (n, m))
a4 = ss.coo_matrix( (3, n), dtype=np.int64)

b = ss.vstack((z_obs, ss.coo_matrix( (3, 1), dtype=np.int64 ))).tocoo()

a1 = ss.hstack((ss.coo_matrix(np.ones((n,p))), ss.coo_matrix(x_obs), ss.coo_matrix(y_obs)))

shape_a3 = a3.shape[0]

for l in np.arange(0, shape_a3):
    for c in np.arange(0, shape_a3) :
        if l == c:
            a3[l, c] = rho
        else:
            a3[l, c] = phi(x_obs[l] - x_obs[c], y_obs[l] - y_obs[c])


a4 = a1.transpose()

a12 = ss.vstack((a1, a2))
a34 = ss.vstack((a3, a4))
a = ss.hstack((a12, a34)).tocoo()


x = spsolve(a, b)


for i in np.arange(0, z_int.shape[0]):
    for j in np.arange(0, z_int.shape[0]):
        z_int[i, j] = x[0] + x[1] * x_int[i, j] + x[2] * y_int[i, j] + np.sum(x[3:] * phi(x_int[i, j] - x_obs, y_int[i, j] - y_obs).T)


return z_int.todense()

Here dist() is a function that computes distance, and phi is the following:

return dist(dx, dy) ** 2 * np.log(dist(dx, dy))

I need the code to run faster. I am aware that it may be badly written, but I would like to learn how to write something more optimized and improve my coding skills.

That code is hard to follow, and understandably slow. Iteration on sparse matrices is even slower than iteration on dense arrays. I almost wish you'd started with a small working example using dense arrays, before worrying about making it work for the large case. I'm not going to try a comprehensive fix or speed up, just nibbles here and there.

This first a1 creation does nothing for you (except waste time). Python is not a compiled language where you define the type of variables at the start. a1 after the second assignment is a sparse matrix because that's what hstack created, not because of the previous coo assignment.

a1 = ss.coo_matrix( (n, 3), dtype=np.int64 )
...
a1 = ss.hstack((ss.coo_matrix(np.ones((n,p))), ss.coo_matrix(x_obs), ss.coo_matrix(y_obs)))
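To illustrate the point about rebinding, here is a minimal sketch with made-up shapes (4 rows, and column blocks of width 1 and 2 standing in for the real data). The second assignment creates a new object, and nothing from the first `coo_matrix`, including its `int64` dtype, carries over:

```python
import numpy as np
import scipy.sparse as ss

# "Pre-declaration": an empty int64 sparse matrix (illustrative shape).
a1 = ss.coo_matrix((4, 3), dtype=np.int64)
first = a1  # keep a handle on the original object for comparison

# Rebinding the name: hstack builds a brand-new matrix from its inputs.
a1 = ss.hstack((ss.coo_matrix(np.ones((4, 1))), ss.coo_matrix(np.ones((4, 2)))))

print(a1 is first)   # the name now points at a different object
print(a1.dtype)      # dtype comes from the stacked blocks, not from the first a1
```

The same applies to `a4`, which is also created as an empty matrix and later replaced by `a1.transpose()`.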

Initializing the dok matrices, z_int and a3, is right, because you iterate to fill in values. But I like to see that kind of initialization closer to the loop, not way back at the top. I would have used lil rather than dok, but I'm not sure whether that's faster.

for l in np.arange(0, shape_a3):
    for c in np.arange(0, shape_a3) :
        if l == c:
            a3[l, c] = rho
        else:
            a3[l, c] = phi(x_obs[l] - x_obs[c], y_obs[l] - y_obs[c])

The l==c test identifies the main diagonal. There are ways of making diagonal matrices. But it looks like you are setting all elements of a3 . If so, why use the slower sparse approach?
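As a rough illustration of why sparse storage does not help when every element is set, the sketch below compares a diagonal-only matrix built with `ss.diags` against a fully populated one. The size `n` and value `rho` are placeholders, not values from the question:

```python
import numpy as np
import scipy.sparse as ss

n = 4       # illustrative size
rho = 0.5   # illustrative diagonal value

# A diagonal matrix can be built directly, without any Python loop.
diag_only = ss.diags(np.full(n, rho), 0, format="csr")

# A matrix in which every entry is nonzero stores n*n values even in sparse form.
full = ss.csr_matrix(np.full((n, n), rho))

print(diag_only.nnz, full.nnz)  # stored entries: n versus n * n
```

When every entry is nonzero, the sparse format has to store all of the values plus their indices, so a dense array is normally the smaller and faster choice.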

What is phi? Does it require scalar inputs? If x_obs is 1d, x_obs[:,None]-x_obs should give a (n,n) array directly.
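A minimal sketch of that broadcasting idea, under two assumptions that are not confirmed by the question: `dist` is the Euclidean distance, and `x_obs` and `y_obs` can be flattened to 1d. The `phi` below is a vectorized stand-in for the asker's function, and `build_a3` is a hypothetical helper name:

```python
import numpy as np

def phi(dx, dy):
    """Vectorized r**2 * log(r), assuming Euclidean r; set to 0 where r == 0."""
    r2 = np.asarray(dx, dtype=float) ** 2 + np.asarray(dy, dtype=float) ** 2
    out = np.zeros_like(r2)
    mask = r2 > 0
    # r**2 * log(r) == 0.5 * r**2 * log(r**2), which avoids a square root
    out[mask] = 0.5 * r2[mask] * np.log(r2[mask])
    return out

def build_a3(x_obs, y_obs, rho):
    """Dense (n, n) kernel matrix with rho on the main diagonal."""
    x_obs = np.ravel(x_obs)
    y_obs = np.ravel(y_obs)
    dx = x_obs[:, None] - x_obs   # (n, 1) - (n,) broadcasts to (n, n)
    dy = y_obs[:, None] - y_obs
    a3 = phi(dx, dy)
    np.fill_diagonal(a3, rho)     # replaces the l == c branch of the loop
    return a3

rng = np.random.default_rng(0)
x_obs, y_obs = rng.random(5), rng.random(5)
a3 = build_a3(x_obs, y_obs, rho=0.1)
print(a3.shape)
```

This needs memory for a few dense (n, n) float arrays, so it only helps if the points in one tile fit in RAM.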

What does spsolve produce? Is x sparse or dense? From your use in the z_int loop it looks like a 1d dense array. It also looks like you are setting all values of z_int.
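One quick way to answer that question is to run `spsolve` on a tiny system and inspect the result. The 2x2 matrix and right-hand side below are made up for illustration:

```python
import numpy as np
import scipy.sparse as ss
from scipy.sparse.linalg import spsolve

# Small illustrative system: 2*x0 = 2, x0 + 3*x1 = 7
a = ss.csc_matrix(np.array([[2.0, 0.0],
                            [1.0, 3.0]]))
b = np.array([2.0, 7.0])

x = spsolve(a, b)
print(type(x), x.shape)  # with a vector right-hand side, x is a 1d numpy array
print(x)
```

With a dense 1d `b` the solution comes back as an ordinary `numpy.ndarray`, so it can be indexed as `x[0]`, `x[3:]` and so on, as the z_int loop does.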

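If phi accepts arrays, I think the loop body

x[0] + x[1] * x_int[i, j] + x[2] * y_int[i, j] + np.sum(x[3:] * phi(x_int[i, j] - x_obs, y_int[i, j] - y_obs).T)

could be replaced by a single broadcast expression over the whole grid, something like

x[0] + x[1] * x_int + x[2] * y_int + phi(x_int[..., None] - x_obs.ravel(), y_int[..., None] - y_obs.ravel()) @ x[3:]

Note that x[3:] has to be multiplied with phi before summing over the observations; np.sum(x[3:]) * phi(...) would not be equivalent. The intermediate array has shape x_int.shape + (n,), so for a large grid it may need to be evaluated in chunks.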
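The vectorized evaluation can be sketched as follows, processing the grid in chunks so that the temporary kernel array stays small. This is an illustration under assumptions, not the asker's code: `dist` is taken to be Euclidean, `phi` is a vectorized stand-in, and `evaluate` and its `chunk` parameter are hypothetical names:

```python
import numpy as np

def phi(dx, dy):
    """Vectorized r**2 * log(r), assuming Euclidean r; set to 0 where r == 0."""
    r2 = np.asarray(dx, dtype=float) ** 2 + np.asarray(dy, dtype=float) ** 2
    out = np.zeros_like(r2)
    mask = r2 > 0
    out[mask] = 0.5 * r2[mask] * np.log(r2[mask])
    return out

def evaluate(x, x_int, y_int, x_obs, y_obs, chunk=10_000):
    """Evaluate the interpolant on the grid, `chunk` grid points at a time."""
    x_obs, y_obs = np.ravel(x_obs), np.ravel(y_obs)
    xi, yi = np.ravel(x_int), np.ravel(y_int)
    z = x[0] + x[1] * xi + x[2] * yi          # polynomial part, whole grid at once
    for start in range(0, xi.size, chunk):
        s = slice(start, start + chunk)
        # (chunk, n) kernel block between grid points and observations
        k = phi(xi[s, None] - x_obs, yi[s, None] - y_obs)
        z[s] += k @ x[3:]                      # weighted sum over observations
    return z.reshape(np.shape(x_int))

# Illustrative data: 6 observations, a 5 x 4 grid, random coefficients.
rng = np.random.default_rng(0)
x_obs, y_obs = rng.random(6), rng.random(6)
x = rng.random(3 + 6)
x_int, y_int = np.meshgrid(np.linspace(0, 1, 4), np.linspace(0, 1, 5))
z_int = evaluate(x, x_int, y_int, x_obs, y_obs, chunk=7)
print(z_int.shape)  # same shape as x_int
```

Peak memory for the kernel block is about chunk * n floats, so `chunk` can be tuned to the available RAM while the inner work stays inside numpy.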
