Truly vectorizing a function for numpy arrays in Python

Question

I have the following function, which takes two 1D numpy arrays q_i and q_j, does some calculation (including taking the norm of their difference) and returns a numpy array:

import numpy as np
import numpy.linalg as lin

def coulomb(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    """
    Parameters
    ----------
    q_i : numpy.ndarray
    q_j : numpy.ndarray
    c1 : float
    c2 : float
    k : float

    Returns
    -------
        numpy.ndarray
    """
    
    q_ij = q_i - q_j
    q_ij_norm = lin.norm(q_i - q_j)
    f_q_i = (k * c1 * c2 / q_ij_norm ** 3) * q_ij
    return f_q_i

Now I have a bunch of these q arrays stored in another numpy array t = [q1, q2, q3, ..., qn], and I want to evaluate the function coulomb for all unique pairs of q_i and q_j inside t (i.e. for (q1, q2), (q1, q3), ..., (q1, qn), (q2, q3), ..., (q2, qn), ..., (q_n-1, qn)).

Is there a way to vectorize this calculation (and I mean really vectorize it to boost performance, because np.vectorize is only a for-loop under the hood)?

My current solution is a nested for-loop, which is far from optimal performance-wise:

for i in range(len(t)):
    for j in range(i + 1, len(t)):  # j must index into t itself, not into the slice t[i+1:]
        f = coulomb(t[i], t[j])
        ...

Answer 1

There is a trade-off between memory and speed when you vectorize this type of numpy problem.

Looping over a numpy array is slow, but it needs little extra memory. To vectorize, you have to replicate the data (for example through broadcasting), which creates larger intermediate arrays that are then passed to numpy functions.
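To put a rough number on that trade-off, here is a small sketch that estimates the size of the full array of pairwise differences for n points of dimension dim. The helper name pairwise_diff_nbytes is made up for illustration, and float64 data is assumed.

```python
import numpy as np

def pairwise_diff_nbytes(n, dim, dtype=np.float64):
    """Bytes needed for the full (n, n, dim) array of pairwise differences."""
    return n * n * dim * np.dtype(dtype).itemsize

# 500 points in 10 dimensions: 20_000_000 bytes, about 20 MB
print(pairwise_diff_nbytes(500, 10))

# 20_000 points in 10 dimensions: 32_000_000_000 bytes, about 32 GB
print(pairwise_diff_nbytes(20_000, 10))
```

The memory grows quadratically with the number of points, so a fully broadcast version that is fine for a few hundred points may not fit in RAM for tens of thousands.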

One way of vectorizing your function is:

import numpy as np
import numpy.linalg as lin

def coulomb(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    """
    Parameters
    ----------
    q_i : numpy.ndarray
    q_j : numpy.ndarray
    c1 : float
    c2 : float
    k : float

    Returns
    -------
        numpy.ndarray
    """
    
    q_ij = q_i[np.newaxis,:, :] - q_j[:,np.newaxis, :] #broadcasting, therefore creating more data
    q_ij_norm = lin.norm(q_ij, axis=2) # this can be zero when qi = qj
    f_q_i = (k * c1 * c2 / (q_ij_norm ** 3)[:,:,np.newaxis]) * q_ij #broadcasting again
    return f_q_i # 3D array of shape (len(q_j), len(q_i), dim_of_q); f_q_i[a, b] is computed from q_i[b] - q_j[a]
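As a usage sketch (not part of the function above), the same broadcasting idea can be applied to a single array t and the unique pairs pulled out afterwards with np.triu_indices. The name coulomb_all_pairs is hypothetical. The axes are arranged here so that result[i, j] matches coulomb(t[i], t[j]), and the zero norms on the diagonal are replaced by infinity so that self-pairs give a zero force instead of nan.

```python
import numpy as np

def coulomb_all_pairs(t, c1=0.8, c2=0.2, k=0.5):
    """Forces for all ordered pairs; result[i, j] corresponds to coulomb(t[i], t[j])."""
    q_ij = t[:, np.newaxis, :] - t[np.newaxis, :, :]  # (n, n, dim): q_ij[i, j] = t[i] - t[j]
    norm = np.linalg.norm(q_ij, axis=2)               # (n, n), zero on the diagonal
    np.fill_diagonal(norm, np.inf)                    # avoid 0/0 for the self-pairs
    return (k * c1 * c2 / norm[:, :, np.newaxis] ** 3) * q_ij

rng = np.random.default_rng(0)
t = rng.standard_normal((5, 3))

f = coulomb_all_pairs(t)              # shape (5, 5, 3)
i, j = np.triu_indices(len(t), k=1)   # index pairs with i < j
unique = f[i, j]                      # shape (10, 3), one row per unique pair
```

Since f[j, i] is just -f[i, j], the upper triangle carries all the information; the full array is computed only because broadcasting makes it convenient.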

Answer 2

Here are three possible solutions. The last one is a little chaotic, but it uses vectorization to compute one q against all the remaining ones in a single call. It is also the fastest.

from itertools import combinations
import numpy as np
import numpy.linalg as lin

def coulomb(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    """
    Parameters
    ----------
    q_i : numpy.ndarray
    q_j : numpy.ndarray
    c1 : float
    c2 : float
    k : float

    Returns
    -------
        numpy.ndarray
    """
    
    q_ij = q_i - q_j
    q_ij_norm = lin.norm(q_ij)
    f_q_i = (k * c1 * c2 / q_ij_norm ** 3) * q_ij
    return f_q_i    

def coulomb2(q_i, q_j, c1=0.8, c2=0.2, k=0.5):
    """
    Parameters
    ----------
    q_i : numpy.ndarray
    q_j : numpy.ndarray
    c1 : float
    c2 : float
    k : float

    Returns
    -------
        numpy.ndarray
    """
    
    q_ij = q_i - q_j
    q_ij_norm = lin.norm(q_ij,axis=1).reshape(-1,1)
    f_q_i = (k * c1 * c2 / q_ij_norm ** 3) * q_ij
    return f_q_i    



from time import time

q = np.random.randn(500, 10)

# 1) nested loops over all unique index pairs
t1 = time()
v = []
for i in range(q.shape[0]):
    for j in range(i + 1, q.shape[0]):
        v.append([coulomb(q[i], q[j])])

t2 = time()

# 2) the same pairs, generated by itertools.combinations
combs = combinations(range(len(q)), 2)
vv = []
for i, j in combs:
    vv.append([coulomb(q[i], q[j])])

t3 = time()

# 3) one vectorized call per row: q[i] against all later rows q[i+1:]
vvv = []
for i in range(q.shape[0]):
    vvv += list(coulomb2(q[i], q[i + 1:]))
t4 = time()

print(t2 - t1)
print(t3 - t2)
print(t4 - t3)

#0.9133327007293701
#1.0843684673309326
#0.04461050033569336
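A possible further step, which is not included in the timings above, is to drop the remaining Python loop by building the index pairs with np.triu_indices and doing the arithmetic for all unique pairs at once. This is only a sketch: coulomb_unique_pairs is a made-up name, and the intermediate arrays have one row per unique pair, so memory use grows quadratically with the number of points.

```python
import numpy as np

def coulomb_unique_pairs(t, c1=0.8, c2=0.2, k=0.5):
    """Return (i, j, forces), where forces[p] corresponds to coulomb(t[i[p]], t[j[p]])."""
    i, j = np.triu_indices(len(t), k=1)                 # all index pairs with i < j
    q_ij = t[i] - t[j]                                  # (n_pairs, dim)
    norm = np.linalg.norm(q_ij, axis=1, keepdims=True)  # (n_pairs, 1)
    return i, j, (k * c1 * c2 / norm ** 3) * q_ij

rng = np.random.default_rng(0)
q = rng.standard_normal((500, 10))

i, j, forces = coulomb_unique_pairs(q)
print(forces.shape)  # (124750, 10)
```

np.triu_indices yields the pairs in the same order as the nested loops, (0, 1), (0, 2), ..., (1, 2), ..., so the rows of forces line up with the list v from the first solution.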


2 answers

Answer 1
0 2021-12-09 21:35:44

Answer 2
0 2021-12-09 21:47:21
