How to optimize very large array construction with numpy or scipy
I am processing very large data sets in 64-bit Python and need some help optimizing my interpolation code.

I am used to using numpy to avoid loops, but here there are 2 loops that I can't find a way to avoid.

The main problem is that the size of the arrays I need to compute raises a MemoryError when I use numpy, so I switched to scipy sparse matrices, which work but take way too much time to compute in the 2 remaining loops...

I tried to build my matrix iteratively using numpy.fromfunction, but it won't run because the size of the array is too large.

I have already read a lot of posts about building large arrays, but the arrays asked about there were too simple compared to what I have to build, so those solutions don't work here.

I cannot reduce the size of the data set, since it is a point cloud that I have already split into 10x10 tiles.

Here is my interpolation code:
z_int = ss.dok_matrix((x_int.shape))
n, p = x_obs.shape
m = y_obs.shape[0]
a1 = ss.coo_matrix((n, 3), dtype=np.int64)
a2 = ss.coo_matrix((3, 3), dtype=np.int64)
a3 = ss.dok_matrix((n, m))
a4 = ss.coo_matrix((3, n), dtype=np.int64)
b = ss.vstack((z_obs, ss.coo_matrix((3, 1), dtype=np.int64))).tocoo()
a1 = ss.hstack((ss.coo_matrix(np.ones((n, p))), ss.coo_matrix(x_obs), ss.coo_matrix(y_obs)))
shape_a3 = a3.shape[0]
for l in np.arange(0, shape_a3):
    for c in np.arange(0, shape_a3):
        if l == c:
            a3[l, c] = rho
        else:
            a3[l, c] = phi(x_obs[l] - x_obs[c], y_obs[l] - y_obs[c])
a4 = a1.transpose()
a12 = ss.vstack((a1, a2))
a34 = ss.vstack((a3, a4))
a = ss.hstack((a12, a34)).tocoo()
x = spsolve(a, b)
for i in np.arange(0, z_int.shape[0]):
    for j in np.arange(0, z_int.shape[0]):
        z_int[i, j] = x[0] + x[1] * x_int[i, j] + x[2] * y_int[i, j] + np.sum(x[3:] * phi(x_int[i, j] - x_obs, y_int[i, j] - y_obs).T)
return z_int.todense()
where dist() is a function that computes distance, and phi is the following:

def phi(dx, dy):
    return dist(dx, dy) ** 2 * np.log(dist(dx, dy))
I need the code to run faster, and I am aware that it might be very badly written, but I would like to learn how to write something more optimized to improve my coding skills.
That code is hard to follow, and understandably slow. Iteration on sparse matrices is even slower than iteration on dense arrays. I almost wish you'd started with a small working example using dense arrays, before worrying about making it work for the large case. I'm not going to try a comprehensive fix or speed-up, just nibbles here and there.
This first a1 creation does nothing for you (except waste time). Python is not a compiled language where you define the type of variables at the start. a1 after the second assignment is a sparse matrix because that's what hstack created, not because of the previous coo assignment.
a1 = ss.coo_matrix( (n, 3), dtype=np.int64 )
...
a1 = ss.hstack((ss.coo_matrix(np.ones((n,p))), ss.coo_matrix(x_obs), ss.coo_matrix(y_obs)))
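To see that the first assignment is wasted, here is a small sketch with made-up shapes (n=5, p=1 and arbitrary x_obs/y_obs are just placeholders): the name a1 is simply rebound, and the result is sparse purely because hstack returns a sparse matrix.

```python
import numpy as np
import scipy.sparse as ss

n, p = 5, 1
x_obs = np.arange(n, dtype=float).reshape(n, p)
y_obs = np.arange(n, dtype=float).reshape(n, p)

# Rebinding a1 here discards any earlier a1 = ss.coo_matrix(...) object;
# the earlier "declaration" never influences this result.
a1 = ss.hstack((ss.coo_matrix(np.ones((n, p))),
                ss.coo_matrix(x_obs),
                ss.coo_matrix(y_obs)))
print(type(a1), a1.shape)
```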
Initializing the dok matrices, z_int and a3, is right, because you iterate to fill in values. But I like to see that kind of initialization closer to the loop, not way back at the top. I would have used lil rather than dok, but I'm not sure whether that's faster.
for l in np.arange(0, shape_a3):
    for c in np.arange(0, shape_a3):
        if l == c:
            a3[l, c] = rho
        else:
            a3[l, c] = phi(x_obs[l] - x_obs[c], y_obs[l] - y_obs[c])
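That whole double loop can be replaced by one broadcasting step on a dense array, with fill_diagonal standing in for the l == c branch. A minimal sketch, assuming dist is Euclidean distance and using made-up x_obs, y_obs, and rho values (the log(0) on the diagonal is silenced because those entries get overwritten anyway):

```python
import numpy as np

def dist(dx, dy):
    return np.hypot(dx, dy)

def phi(dx, dy):
    d = dist(dx, dy)
    # d == 0 would give 0 * log(0) = nan; mask it out, the
    # diagonal is replaced by rho below anyway
    with np.errstate(divide='ignore', invalid='ignore'):
        out = d ** 2 * np.log(d)
    return np.where(d == 0, 0.0, out)

rho = 0.1
x_obs = np.array([0.0, 1.0, 2.0])
y_obs = np.array([0.0, 1.0, 0.0])

# all pairwise differences at once: (n, 1) - (n,) broadcasts to (n, n)
dx = x_obs[:, None] - x_obs
dy = y_obs[:, None] - y_obs
a3 = phi(dx, dy)
np.fill_diagonal(a3, rho)   # replaces the l == c branch
```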
The l==c test identifies the main diagonal. There are ways of making diagonal matrices. But it looks like you are setting all elements of a3. If so, why use the slower sparse approach?
What is phi? Does it require scalar inputs? x_obs[:,None] - x_obs should give an (n,n) array directly.
What does spsolve produce? Is x sparse or dense? From your use in the z_int loop it looks like a 1d dense array. It looks like you are setting all values of z_int.
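A tiny check of what spsolve returns, on a made-up 2x2 system (not your matrices): with a dense right-hand side it gives back a plain 1d ndarray, not a sparse matrix.

```python
import numpy as np
import scipy.sparse as ss
from scipy.sparse.linalg import spsolve

# toy system: 3x + y = 9, x + 2y = 8  ->  x = 2, y = 3
a = ss.csc_matrix(np.array([[3.0, 1.0], [1.0, 2.0]]))
b = np.array([9.0, 8.0])
x = spsolve(a, b)
print(type(x), x.shape)
```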
If phi takes an (n,n) array, I think

z_int[i, j] = x[0] + x[1] * x_int[i, j] + x[2] * y_int[i, j] + np.sum(x[3:] * phi(x_int[i, j] - x_obs, y_int[i, j] - y_obs).T)

can be replaced with something like

z_int = x[0] + x[1] * x_int + x[2] * y_int + np.sum(x[3:] * phi(x_int - x_obs, y_int - y_obs).T)
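Since x_int is 2d while x_obs is 1d, the broadcast needs a trailing axis to line up: x_int[..., None] - x_obs gives a (rows, cols, n) stack of differences, and the sum over observations becomes a matrix product. A sketch with made-up data, where x stands in for the spsolve output and phi/dist are assumed as above:

```python
import numpy as np

def dist(dx, dy):
    return np.hypot(dx, dy)

def phi(dx, dy):
    d = dist(dx, dy)
    with np.errstate(divide='ignore', invalid='ignore'):
        out = d ** 2 * np.log(d)
    return np.where(d == 0, 0.0, out)

# made-up problem: 3 observation points, a 2x2 interpolation grid
x_obs = np.array([0.0, 1.0, 2.0])
y_obs = np.array([0.0, 1.0, 0.0])
x = np.array([0.5, 0.1, 0.2, 1.0, -1.0, 0.3])  # stands in for spsolve output

x_int = np.array([[0.0, 1.0], [2.0, 3.0]])
y_int = np.array([[0.0, 0.5], [1.0, 1.5]])

# grid vs observations: (2, 2, 1) - (3,) broadcasts to (2, 2, 3)
k = phi(x_int[..., None] - x_obs, y_int[..., None] - y_obs)

# k @ x[3:] sums over the observation axis, replacing the double loop
z_int = x[0] + x[1] * x_int + x[2] * y_int + k @ x[3:]
```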