The most efficient way to sum all possible pairs (x_ik, y_j) for a given k?
I have two numpy arrays: x with shape (n, m) and y with shape (p,). I would like to sum all possible pairs x[k, i] and y[j] to create a new numpy array z with shape (n, m*p).
A naïve algorithm would be:
import numpy as np
# some code
z = np.empty((n, m*p))
for k in range(n):
    for i in range(m):
        for j in range(p):
            z[k, i + m * j] = x[k, i] + y[j]
This algorithm has polynomial complexity: O(n*m*p)
Since I am working with arrays where $n \sim 10^6$, I am looking for a more efficient algorithm that uses the power of numpy and/or pandas.
I have done some research and found a possible solution: "Efficient way to sum all possible pairs"
But it does not fit my specific problem. I could use it, but the result would still not be pythonic, because I would have to iterate over k with a loop (the solution can be reused without much effort for n=1).
As others have said in the comments, this does not improve on the complexity, but it does make use of vectorization and memory contiguity:
np.add.outer(x, y).reshape(len(x), -1)
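
One detail worth checking is the column order. The sketch below compares the one-liner with the triple loop from the question on small, made-up sizes (the values of n, m, p and the random data are assumptions for illustration only). np.add.outer yields an array of shape (n, m, p), so after reshaping the sum x[k, i] + y[j] sits at column i * p + j, whereas the loop writes it to column i + m * j. If the loop's ordering matters, broadcasting to shape (n, p, m) before reshaping appears to reproduce it:

```python
import numpy as np

# Small illustrative sizes (assumed; the question mentions n around 1e6).
n, m, p = 4, 3, 2
rng = np.random.default_rng(0)
x = rng.random((n, m))
y = rng.random(p)

# Reference result: the triple loop from the question.
z_loop = np.empty((n, m * p))
for k in range(n):
    for i in range(m):
        for j in range(p):
            z_loop[k, i + m * j] = x[k, i] + y[j]

# np.add.outer has shape (n, m, p); flattening the last two axes
# places x[k, i] + y[j] at column i * p + j.
z_outer = np.add.outer(x, y).reshape(n, -1)

# Broadcasting to shape (n, p, m) instead places x[k, i] + y[j]
# at column j * m + i, which is the loop's i + m * j.
z_bcast = (x[:, None, :] + y[None, :, None]).reshape(n, -1)
```

Both vectorized versions contain the same values per row; only the order of the columns differs, so the choice depends on whether later code relies on the indexing used in the loop.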