
Is there a way to vectorize polynomial expansion of a numpy array?

The main idea is a feature expansion: expand the elements of an array into their polynomial terms up to a given power. For power two, a numpy array with two elements maps as [x, y] -> [x, y, x^2, y^2, x*y], and one with three elements as [x, y, z] -> [x, y, z, x^2, y^2, z^2, x*y, x*z, y*z]. I am able to solve this using itertools combinations, but it is quite slow (half a minute or so for an array of ~100k elements). Is there a way to vectorize this to improve the speed?
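The question does not include the slow code, so the following is only a guess at what a per-row itertools baseline might look like. The function name `expand_row_slow` and the sample data are hypothetical, chosen purely for illustration.

```python
import numpy as np
from itertools import combinations

def expand_row_slow(row):
    """Degree-2 expansion of one 1-D row using Python-level loops (hypothetical baseline)."""
    row = list(row)
    squares = [v * v for v in row]                    # x^2, y^2, ...
    cross = [a * b for a, b in combinations(row, 2)]  # x*y, x*z, y*z, ...
    return row + squares + cross

# Hypothetical sample data: three rows of two features each
data = np.array([[4., 7.], [6., 5.], [9., 2.]])
slow = np.array([expand_row_slow(r) for r in data])
```

Each row is handled by interpreted Python code, which is the usual reason an approach of this kind becomes slow on large inputs.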

You can use NumPy broadcasting:

import numpy as np

arr = np.array([2,3])
exponents = np.arange(4) + 1   # powers 1..4
(arr**exponents[:,None]).ravel()

Output:

array([ 2,  3,  4,  9,  8, 27, 16, 81])
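To make the broadcasting step explicit, here is a small sketch that splits the one-liner into its intermediate shapes. The names `column`, `powers` and `flat` are introduced here for illustration only. Note that this result contains only pure powers of each element, with no cross terms such as x*y.

```python
import numpy as np

arr = np.array([2, 3])         # shape (2,)
exponents = np.arange(1, 5)    # shape (4,): 1, 2, 3, 4

column = exponents[:, None]    # shape (4, 1): one exponent per row
powers = arr ** column         # shapes (2,) and (4, 1) broadcast to (4, 2)
flat = powers.ravel()          # flatten row by row: all 1st powers, then 2nd, ...
```

Row k of `powers` holds every element raised to the power k + 1, which is why the flattened output lists the elements power by power.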

If I understand the requirement properly, it's possible to use NumPy index arrays, that is, to access elements of an array using an array of indices.

 a = np.array([ 5, 7., 1 ])
 #              x, y, One                                                        

 a[[ 0, 1, 0, 1, 0 ]]         # Note double brackets                                                        
 # array([5., 7., 5., 7., 5.])

 a[[ 2, 2, 0, 1, 1 ]]         # Note double brackets
 # array([1., 1., 5., 7., 7.])

 a[[ 0, 1, 0, 1, 0 ]] * a[[ 2, 2, 0, 1, 1 ]]                                            
 # array([ 5.,  7., 25., 49., 35.])
 #         x    y  x**2 y**2  x*y

If it's not possible to add a 1 to the input array, build the second factor separately:

b = np.ones(5) 
b[2:]=a[[0,1,1]]

a[[ 0, 1, 0, 1, 0 ]] * b                                                              
# array([ 5.,  7., 25., 49., 35.])

Both of these approaches could be generalised to additional dimensions. I assume this must be a requirement if there's a run time issue.
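One possible way to generalise the index-array idea to any number of columns is sketched below. It assumes a 2-D input with one sample per row and a fixed power of two. The function name `expand_degree2` is hypothetical, and `np.triu_indices` is used here to build the index pairs instead of writing them out by hand.

```python
import numpy as np

def expand_degree2(arr):
    """Return [features, squares, pairwise products] for each row of a 2-D array."""
    arr = np.asarray(arr, dtype=float)
    n = arr.shape[1]
    # Index pairs (i, j) with i < j, e.g. (0,1), (0,2), (1,2) for n = 3
    i, j = np.triu_indices(n, k=1)
    cross = arr[:, i] * arr[:, j]           # x*y, x*z, y*z, ...
    return np.hstack([arr, arr**2, cross])  # linear terms, squares, cross terms

rows = np.array([[2., 3., 5.],
                 [1., 4., 6.]])
out = expand_degree2(rows)
```

For the three-column input above, each output row is ordered as x, y, z, x^2, y^2, z^2, x*y, x*z, y*z, matching the layout asked for in the question.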

Edit:

np.random.seed( 1234 )                                                                    

arr = np.random.randint( 1, 10, size = ( 3, 2))*1.0                                       
arr                                                                                       
# array([[4., 7.],
#        [6., 5.],
#        [9., 2.]])

def expand( arr ): 
    result = np.zeros( ( arr.shape[0], 5 ) ) 
    result[ : , 0:2 ] = arr      # Set first two columns to x and y
    # Set columns 2 to 4 to x**2, y**2 and x*y
    result[ :, 2: ] = arr[ :, [ 0, 1, 0 ]] * arr[:, [ 0, 1, 1 ]] 

    return result 

expand( arr )                                                                             
# array([[ 4.,  7., 16., 49., 28.],
#        [ 6.,  5., 36., 25., 30.],
#        [ 9.,  2., 81.,  4., 18.]])


 