How to optimize very large array construction with numpy or scipy
I am processing very large data sets in 64-bit Python and need some help optimizing my interpolation code.

I am used to using numpy to avoid loops, but here there are 2 loops that I can't find a way to avoid.

The main problem is that the size of the arrays I need to compute raises a MemoryError when I use numpy, so I switched to scipy sparse matrices, which work but take way too much time to compute in the 2 remaining loops...

I tried to build my matrix iteratively using numpy.fromfunction, but it won't run because the size of the array is too large.

I have already read a lot of posts about building large arrays, but the arrays asked about there were too simple compared to what I have to build, so those solutions don't work here.

I cannot reduce the size of the data set, since it is a point cloud that I have already split into 10x10 tiles.

Here is my interpolation code:
z_int = ss.dok_matrix((x_int.shape))
n, p = x_obs.shape
m = y_obs.shape[0]
a1 = ss.coo_matrix((n, 3), dtype=np.int64)
a2 = ss.coo_matrix((3, 3), dtype=np.int64)
a3 = ss.dok_matrix((n, m))
a4 = ss.coo_matrix((3, n), dtype=np.int64)
b = ss.vstack((z_obs, ss.coo_matrix((3, 1), dtype=np.int64))).tocoo()
a1 = ss.hstack((ss.coo_matrix(np.ones((n, p))), ss.coo_matrix(x_obs), ss.coo_matrix(y_obs)))
shape_a3 = a3.shape[0]
for l in np.arange(0, shape_a3):
    for c in np.arange(0, shape_a3):
        if l == c:
            a3[l, c] = rho
        else:
            a3[l, c] = phi(x_obs[l] - x_obs[c], y_obs[l] - y_obs[c])
a4 = a1.transpose()
a12 = ss.vstack((a1, a2))
a34 = ss.vstack((a3, a4))
a = ss.hstack((a12, a34)).tocoo()
x = spsolve(a, b)
for i in np.arange(0, z_int.shape[0]):
    for j in np.arange(0, z_int.shape[0]):
        z_int[i, j] = x[0] + x[1] * x_int[i, j] + x[2] * y_int[i, j] + np.sum(x[3:] * phi(x_int[i, j] - x_obs, y_int[i, j] - y_obs).T)
return z_int.todense()
where dist() is a function that computes distance, and phi is the following:

def phi(dx, dy):
    return dist(dx, dy) ** 2 * np.log(dist(dx, dy))
I need the code to run faster, and I am aware that it might be very badly written, but I would like to learn how to write something more optimized to improve my coding skills.
That code is hard to follow, and understandably slow. Iteration on sparse matrices is even slower than iteration on dense arrays. I almost wish you'd started with a small working example using dense arrays, before worrying about making it work for the large case. I'm not going to try a comprehensive fix or speed-up, just nibbles here and there.
This first a1 creation does nothing for you (except waste time). Python is not a compiled language where you define the type of variables at the start. a1 after the second assignment is a sparse matrix because that's what hstack created, not because of the previous coo assignment.
a1 = ss.coo_matrix( (n, 3), dtype=np.int64 )
...
a1 = ss.hstack((ss.coo_matrix(np.ones((n,p))), ss.coo_matrix(x_obs), ss.coo_matrix(y_obs)))
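To see that the first assignment is wasted, here is a small sketch with made-up shapes (n=5, p=1 and arbitrary x_obs/y_obs are just placeholders): the name a1 is simply rebound, and the result is sparse purely because hstack returns a sparse matrix.

```python
import numpy as np
import scipy.sparse as ss

n, p = 5, 1
x_obs = np.arange(n, dtype=float).reshape(n, p)
y_obs = np.arange(n, dtype=float).reshape(n, p)

# Rebinding a1 here discards any earlier a1 = ss.coo_matrix(...) object;
# the earlier "declaration" never influences this result.
a1 = ss.hstack((ss.coo_matrix(np.ones((n, p))),
                ss.coo_matrix(x_obs),
                ss.coo_matrix(y_obs)))
print(type(a1), a1.shape)
```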
Initializing the dok matrices, z_int and a3, is right, because you iterate to fill in values. But I like to see that kind of initialization closer to the loop, not way back at the top. I would have used lil rather than dok, but I'm not sure whether that's faster.
for l in np.arange(0, shape_a3):
    for c in np.arange(0, shape_a3):
        if l == c:
            a3[l, c] = rho
        else:
            a3[l, c] = phi(x_obs[l] - x_obs[c], y_obs[l] - y_obs[c])
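That whole double loop can be replaced by one broadcasting step on a dense array, with fill_diagonal standing in for the l == c branch. A minimal sketch, assuming dist is Euclidean distance and using made-up x_obs, y_obs, and rho values (the log(0) on the diagonal is silenced because those entries get overwritten anyway):

```python
import numpy as np

def dist(dx, dy):
    return np.hypot(dx, dy)

def phi(dx, dy):
    d = dist(dx, dy)
    # d == 0 would give 0 * log(0) = nan; mask it out, the
    # diagonal is replaced by rho below anyway
    with np.errstate(divide='ignore', invalid='ignore'):
        out = d ** 2 * np.log(d)
    return np.where(d == 0, 0.0, out)

rho = 0.1
x_obs = np.array([0.0, 1.0, 2.0])
y_obs = np.array([0.0, 1.0, 0.0])

# all pairwise differences at once: (n, 1) - (n,) broadcasts to (n, n)
dx = x_obs[:, None] - x_obs
dy = y_obs[:, None] - y_obs
a3 = phi(dx, dy)
np.fill_diagonal(a3, rho)   # replaces the l == c branch
```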
The l==c test identifies the main diagonal. There are ways of making diagonal matrices. But it looks like you are setting all elements of a3. If so, why use the slower sparse approach?
What is phi? Does it require scalar inputs? x_obs[:,None] - x_obs should give an (n,n) array directly.
What does spsolve produce? Is x sparse or dense? From your use in the z_int loop it looks like a 1d dense array. It looks like you are setting all values of z_int.
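A tiny check of what spsolve returns, on a made-up 2x2 system (not your matrices): with a dense right-hand side it gives back a plain 1d ndarray, not a sparse matrix.

```python
import numpy as np
import scipy.sparse as ss
from scipy.sparse.linalg import spsolve

# toy system: 3x + y = 9, x + 2y = 8  ->  x = 2, y = 3
a = ss.csc_matrix(np.array([[3.0, 1.0], [1.0, 2.0]]))
b = np.array([9.0, 8.0])
x = spsolve(a, b)
print(type(x), x.shape)
```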
If phi takes an (n,n) array, I think

z_int[i, j] = x[0] + x[1] * x_int[i, j] + x[2] * y_int[i, j] + np.sum(x[3:] * phi(x_int[i, j] - x_obs, y_int[i, j] - y_obs).T)

can be replaced with something like

z_int = x[0] + x[1] * x_int + x[2] * y_int + np.sum(x[3:] * phi(x_int - x_obs, y_int - y_obs).T)
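Since x_int is 2d while x_obs is 1d, the broadcast needs a trailing axis to line up: x_int[..., None] - x_obs gives a (rows, cols, n) stack of differences, and the sum over observations becomes a matrix product. A sketch with made-up data, where x stands in for the spsolve output and phi/dist are assumed as above:

```python
import numpy as np

def dist(dx, dy):
    return np.hypot(dx, dy)

def phi(dx, dy):
    d = dist(dx, dy)
    with np.errstate(divide='ignore', invalid='ignore'):
        out = d ** 2 * np.log(d)
    return np.where(d == 0, 0.0, out)

# made-up problem: 3 observation points, a 2x2 interpolation grid
x_obs = np.array([0.0, 1.0, 2.0])
y_obs = np.array([0.0, 1.0, 0.0])
x = np.array([0.5, 0.1, 0.2, 1.0, -1.0, 0.3])  # stands in for spsolve output

x_int = np.array([[0.0, 1.0], [2.0, 3.0]])
y_int = np.array([[0.0, 0.5], [1.0, 1.5]])

# grid vs observations: (2, 2, 1) - (3,) broadcasts to (2, 2, 3)
k = phi(x_int[..., None] - x_obs, y_int[..., None] - y_obs)

# k @ x[3:] sums over the observation axis, replacing the double loop
z_int = x[0] + x[1] * x_int + x[2] * y_int + k @ x[3:]
```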