
How to optimize very large array construction with numpy or scipy

I am processing very large data sets in 64-bit Python and need some help optimizing my interpolation code.

I am used to using numpy to avoid loops, but here there are 2 loops that I can't find a way to avoid.

The main problem is that the arrays I need to compute are so large that numpy raises a MemoryError, so I switched to scipy sparse matrices, which work but take far too long in the two remaining loops...

I tried to build my matrix iteratively using numpy.fromfunction, but it won't run because the array is too large.

I have already read a lot of posts about building large arrays, but the arrays asked about there were too simple compared to what I have to build, so those solutions don't work here.

I cannot reduce the size of the data set, since it is a point cloud that I have already split into 10x10 tiles.

Here is my interpolation code:

import numpy as np
import scipy.sparse as ss
from scipy.sparse.linalg import spsolve

z_int = ss.dok_matrix(x_int.shape)

n,p = x_obs.shape
m = y_obs.shape[0]
a1 = ss.coo_matrix( (n, 3), dtype=np.int64 )
a2 = ss.coo_matrix( (3, 3), dtype=np.int64 )
a3 = ss.dok_matrix( (n, m))
a4 = ss.coo_matrix( (3, n), dtype=np.int64)

b = ss.vstack((z_obs, ss.coo_matrix( (3, 1), dtype=np.int64 ))).tocoo()

a1 = ss.hstack((ss.coo_matrix(np.ones((n,p))), ss.coo_matrix(x_obs), ss.coo_matrix(y_obs)))

shape_a3 = a3.shape[0]

for l in np.arange(0, shape_a3):
    for c in np.arange(0, shape_a3):
        if l == c:
            a3[l, c] = rho
        else:
            a3[l, c] = phi(x_obs[l] - x_obs[c], y_obs[l] - y_obs[c])


a4 = a1.transpose()

a12 = ss.vstack((a1, a2))
a34 = ss.vstack((a3, a4))
a = ss.hstack((a12, a34)).tocoo()


x = spsolve(a, b)


for i in np.arange(0, z_int.shape[0]):
    for j in np.arange(0, z_int.shape[0]):
        z_int[i, j] = x[0] + x[1] * x_int[i, j] + x[2] * y_int[i, j] + np.sum(x[3:] * phi(x_int[i, j] - x_obs, y_int[i, j] - y_obs).T)


return z_int.todense()

where dist() is a function that computes distance, and phi is the following:

def phi(dx, dy):
    return dist(dx, dy) ** 2 * np.log(dist(dx, dy))
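For reference, a minimal self-contained version of dist and phi (assuming dist is plain Euclidean distance, which the question doesn't show). Computing the distance once avoids the duplicated dist() call, and a guard handles the r == 0 singularity of r² log r:

```python
import numpy as np

def dist(dx, dy):
    # Assumed implementation: plain Euclidean distance
    return np.sqrt(dx ** 2 + dy ** 2)

def phi(dx, dy):
    # Thin-plate-spline kernel r^2 * log(r), computing dist() only once.
    # The kernel's limit at r == 0 is 0, so guard against log(0).
    r = dist(dx, dy)
    return np.where(r > 0, r ** 2 * np.log(np.maximum(r, 1e-300)), 0.0)
```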

I need the code to run faster. I am aware that it might be very badly written, but I would like to learn how to write something more optimized to improve my coding skills.

That code is hard to follow, and understandably slow. Iteration on sparse matrices is even slower than iteration on dense arrays. I almost wish you'd started with a small working example using dense arrays before worrying about making it work for the large case. I'm not going to attempt a comprehensive fix or speed-up, just nibble here and there.

This first a1 creation does nothing for you (except waste time). Python is not a compiled language where you declare the types of variables at the start. After the second assignment, a1 is a sparse matrix because that's what hstack created, not because of the earlier coo assignment.

a1 = ss.coo_matrix( (n, 3), dtype=np.int64 )
...
a1 = ss.hstack((ss.coo_matrix(np.ones((n,p))), ss.coo_matrix(x_obs), ss.coo_matrix(y_obs)))

Initializing the dok matrices z_int and a3 is right, because you iterate to fill in values. But I like to see that kind of initialization closer to the loop, not way back at the top. I would have used lil rather than dok, though I'm not sure which is faster.
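A small sketch of that suggestion, with placeholder sizes and values: allocate the lil_matrix right before the loop that fills it, then convert once before doing arithmetic:

```python
import numpy as np
import scipy.sparse as ss

n = 4        # placeholder for the real number of observations
rho = 2.5    # placeholder smoothing value on the diagonal

# lil_matrix supports efficient incremental assignment, like dok
a3 = ss.lil_matrix((n, n))
for l in range(n):
    a3[l, l] = rho
a3 = a3.tocsr()  # convert once before any matrix arithmetic
```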

for l in np.arange(0, shape_a3):
    for c in np.arange(0, shape_a3) :
        if l == c:
            a3[l, c] = rho
        else:
            a3[l, c] = phi(x_obs[l] - x_obs[c], y_obs[l] - y_obs[c])

The l == c test identifies the main diagonal, and there are ways of making diagonal matrices directly. But it looks like you are setting all elements of a3. If so, why use the slower sparse approach?

What is phi? Does it require scalar inputs? x_obs[:, None] - x_obs should give an (n, n) array directly.
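If phi accepts arrays, the whole a3 double loop collapses to broadcasting plus a diagonal fill. A sketch with made-up observation points (phi here is a vectorized r² log r written in terms of r², which sidesteps the r == 0 singularity):

```python
import numpy as np

# Made-up observation points standing in for x_obs / y_obs
x_obs = np.array([0.0, 1.0, 2.0])
y_obs = np.array([0.0, 1.0, 0.0])
rho = 5.0

def phi(dx, dy):
    # r^2 * log(r) = 0.5 * r^2 * log(r^2); defined as 0 at r == 0
    r2 = dx ** 2 + dy ** 2
    return np.where(r2 > 0, 0.5 * r2 * np.log(np.maximum(r2, 1e-300)), 0.0)

# Broadcasting builds all pairwise differences at once: shape (n, n)
dx = x_obs[:, None] - x_obs[None, :]
dy = y_obs[:, None] - y_obs[None, :]
a3 = phi(dx, dy)
np.fill_diagonal(a3, rho)  # the l == c branch of the original loop
```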

What does spsolve produce for x, sparse or dense? From your use in the z_int loop it looks like a 1d dense array. It also looks like you are setting all values of z_int.

If phi takes an (n, n) array, I think

x[0] + x[1] * x_int[i, j] + x[2] * y_int[i, j] + np.sum(x[3:] * phi(x_int[i, j] - x_obs, y_int[i, j] - y_obs).T)

could become

x[0] + x[1] * x_int + x[2] * y_int + np.sum(x[3:] * phi(x_int - x_obs, y_int - y_obs).T)
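That one-liner still needs a trailing observation axis spelled out for broadcasting to work. A hedged sketch of the fully vectorized final loop, with small made-up arrays and phi as an assumed vectorized r² log r kernel, checked against the original element-wise formula:

```python
import numpy as np

# Small made-up stand-ins for the question's arrays
x_int = np.array([[0.0, 1.0], [2.0, 3.0]])
y_int = np.array([[0.0, 0.5], [1.0, 1.5]])
x_obs = np.array([0.0, 1.0, 2.0])
y_obs = np.array([0.5, 1.5, 0.5])
x = np.array([1.0, 2.0, 3.0, 0.1, 0.2, 0.3])  # stand-in for the spsolve result

def phi(dx, dy):
    # Vectorized r^2 * log(r) = 0.5 * r^2 * log(r^2), defined as 0 at r == 0
    r2 = dx ** 2 + dy ** 2
    return np.where(r2 > 0, 0.5 * r2 * np.log(np.maximum(r2, 1e-300)), 0.0)

# A trailing axis lets every grid point broadcast against all observations
dx = x_int[..., None] - x_obs      # shape (rows, cols, n)
dy = y_int[..., None] - y_obs
z_int = (x[0] + x[1] * x_int + x[2] * y_int
         + np.einsum('ijk,k->ij', phi(dx, dy), x[3:]))
```

The einsum sums the kernel values against the weights x[3:] over the observation axis, which is exactly what np.sum did per grid point in the original double loop.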
