Numpy: Broadcasting from submatrix

Given two 2D arrays:

A =[[1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4]]

B =[[1, 2],
    [3, 4]]

A - B = [[ 0, -1,  1,  0],
         [-2, -3, -1, -2],
         [ 2,  1,  3,  2],
         [ 0, -1,  1,  0]]

B's shape is (2, 2) and A's is (4, 4). I want to perform a broadcast subtraction of B over A: A - B.

I specifically want to use broadcasting because the arrays I am dealing with are very large (8456, 8456). I am hoping that broadcasting will provide a small performance optimization.

I've tried reshaping the arrays, but with no luck, and I am stumped. Scikit is not available to me.
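For reference, here is a minimal sketch of why the plain expression A - B does not work on these shapes. NumPy aligns shapes from the trailing axis and only stretches axes of length 1, so (4, 4) and (2, 2) are incompatible. The flag name direct_subtraction_works is only illustrative.

```python
import numpy as np

A = np.array([[1, 1, 2, 2],
              [1, 1, 2, 2],
              [3, 3, 4, 4],
              [3, 3, 4, 4]])
B = np.array([[1, 2],
              [3, 4]])

# Broadcasting only stretches axes of length 1, so a length-2 axis
# cannot be matched against a length-4 axis.
try:
    A - B
    direct_subtraction_works = True
except ValueError:
    direct_subtraction_works = False
```

The answers below all work around this by giving B (or A) extra axes or explicit repetition so that the shapes line up.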

You can expand B by tiling it twice in both dimensions:

print(A - numpy.tile(B, (2, 2)))

yields

[[ 0 -1  1  0]
 [-2 -3 -1 -2]
 [ 2  1  3  2]
 [ 0 -1  1  0]]

However, for big matrices this may create a lot of overhead in RAM.
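As a rough illustration of that overhead, the sketch below compares the memory held by B with the memory held by its tiled copy. The 8-byte element size is an assumption (int64); the question does not state the dtype.

```python
import numpy as np

B = np.arange(4, dtype=np.int64).reshape(2, 2)
tiled = np.tile(B, (2, 2))      # a new (4, 4) array, not a view of B

b_bytes = B.nbytes              # 4 elements * 8 bytes
tiled_bytes = tiled.nbytes      # 16 elements * 8 bytes
shares = np.shares_memory(B, tiled)

# Size of a tiled array matching an (8456, 8456) A, assuming 8-byte elements
full_bytes = 8456 * 8456 * 8
```

Under that assumption, the tiled array for the (8456, 8456) case would take roughly 0.57 GB in addition to A itself.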

Alternatively, you can view A in blocks using scikit-image's skimage.util.view_as_blocks and modify it in place:

import skimage.util

Atmp = skimage.util.view_as_blocks(A, block_shape=(2, 2))
Atmp -= B

print(A)

which gives the following result without needlessly repeating B:

[[ 0 -1  1  0]
 [-2 -3 -1 -2]
 [ 2  1  3  2]
 [ 0 -1  1  0]]
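Since the question says scikit is not available, a similar in-place update can be sketched with NumPy alone. This assumes A is C-contiguous (so the reshape returns a view) and that its shape is a multiple of B's; the name blocks is only illustrative.

```python
import numpy as np

A = np.array([[1, 1, 2, 2],
              [1, 1, 2, 2],
              [3, 3, 4, 4],
              [3, 3, 4, 4]])
B = np.array([[1, 2],
              [3, 4]])

m2, n2 = B.shape
# For a C-contiguous A this reshape is a view with shape (s1, m2, s2, n2),
# so the in-place subtraction writes through to A itself.
blocks = A.reshape(A.shape[0] // m2, m2, A.shape[1] // n2, n2)
# B viewed as (m2, 1, n2) broadcasts over both block-index axes.
blocks -= B[:, None, :]
```

After this runs, A holds the same values as the output shown above, and no tiled copy of B is created.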

Approach #1: Here's an approach using strides. It relies on views to replicate B without first building a tiled copy, and then subtracts the result from A, so it should be quite efficient -

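# Byte strides of B along rows (m) and columns (n)
m,n = B.strides
m1,n1 = A.shape
m2,n2 = B.shape
# Number of times B repeats along each axis of A
s1,s2 = m1//m2, n1//n2
strided = np.lib.stride_tricks.as_strided
# Zero strides on the repeat axes reuse B's data; m equals n2*n for a C-contiguous B
out = A - strided(B,shape=(s1,m2,s2,n2),strides=(0,m,0,n)).reshape(A.shape)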

Sample run -

In [78]: A
Out[78]: 
array([[29, 53, 30, 25, 92, 10],
       [ 2, 20, 35, 87,  0,  9],
       [46, 30, 20, 62, 79, 63],
       [44,  9, 78, 33,  6, 40]])

In [79]: B
Out[79]: 
array([[35, 60],
       [21, 86]])

In [80]: m,n = B.strides
    ...: m1,n1 = A.shape
    ...: m2,n2 = B.shape
    ...: s1,s2 = m1//m2, n1//n2
    ...: strided = np.lib.stride_tricks.as_strided
    ...: 

In [81]: # Replicated view
    ...: strided(B,shape=(s1,m2,s2,n2),strides=(0,n2*n,0,n)).reshape(A.shape)
Out[81]: 
array([[35, 60, 35, 60, 35, 60],
       [21, 86, 21, 86, 21, 86],
       [35, 60, 35, 60, 35, 60],
       [21, 86, 21, 86, 21, 86]])

In [82]: A - strided(B,shape=(s1,m2,s2,n2),strides=(0,n2*n,0,n)).reshape(A.shape)
Out[82]: 
array([[ -6,  -7,  -5, -35,  57, -50],
       [-19, -66,  14,   1, -21, -77],
       [ 11, -30, -15,   2,  44,   3],
       [ 23, -77,  57, -53, -15, -46]])
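A variation on the same idea, sketched here with np.broadcast_to instead of as_strided, avoids hand-written byte strides. This is a swapped-in alternative, not the code from the answer, and the variable names are illustrative. It also checks which steps are views: the 4D replicated array shares memory with B, but reshaping it to 2D has to copy, because merging a zero-stride axis with a regular one cannot be described by strides.

```python
import numpy as np

A = np.array([[29, 53, 30, 25, 92, 10],
              [ 2, 20, 35, 87,  0,  9],
              [46, 30, 20, 62, 79, 63],
              [44,  9, 78, 33,  6, 40]])
B = np.array([[35, 60],
              [21, 86]])

m1, n1 = A.shape
m2, n2 = B.shape
s1, s2 = m1 // m2, n1 // n2

# Read-only, zero-stride view of B with shape (s1, m2, s2, n2)
replicated = np.broadcast_to(B[None, :, None, :], (s1, m2, s2, n2))
is_view = np.shares_memory(replicated, B)

# Collapsing to 2D materializes the replicated data
flat = replicated.reshape(m1, n1)
copied = not np.shares_memory(flat, B)

out = A - flat
```

If avoiding that intermediate copy matters, subtracting in 4D and reshaping only the result, as in the next approach, sidesteps it.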

Approach #2: We can reshape both A and B to 4D shapes, with B having two singleton dimensions along which its elements are broadcast when subtracted from the 4D version of A. After the subtraction, we reshape back to 2D for the final output. Thus, we would have an implementation like so -

m1,n1 = A.shape
m2,n2 = B.shape
out = (A.reshape(m1//m2,m2,n1//n2,n2) - B.reshape(1,m2,1,n2)).reshape(m1,n1)
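The reshape-based version can be wrapped in a small function, sketched below. The name block_subtract and the divisibility check are additions for illustration, not part of the original answer.

```python
import numpy as np

def block_subtract(A, B):
    """Hypothetical helper: subtract B from every B-shaped block of A."""
    m1, n1 = A.shape
    m2, n2 = B.shape
    if m1 % m2 or n1 % n2:
        raise ValueError("A's shape must be a multiple of B's shape")
    A4 = A.reshape(m1 // m2, m2, n1 // n2, n2)   # block index and in-block index per axis
    B4 = B.reshape(1, m2, 1, n2)                 # singleton axes broadcast over the blocks
    return (A4 - B4).reshape(m1, n1)

A = np.array([[1, 1, 2, 2],
              [1, 1, 2, 2],
              [3, 3, 4, 4],
              [3, 3, 4, 4]])
B = np.array([[1, 2],
              [3, 4]])

out = block_subtract(A, B)
matches_tile = np.array_equal(out, A - np.tile(B, (2, 2)))
```

Comparing against the np.tile result is a cheap way to sanity-check the axis order on a small input before running it on the large arrays.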

This should work if A's dimensions are multiples of B's dimensions:

A - np.tile(B, (A.shape[0] // B.shape[0], A.shape[1] // B.shape[1]))

And the result:

array([[ 0, -1,  1,  0],
       [-2, -3, -1, -2],
       [ 2,  1,  3,  2],
       [ 0, -1,  1,  0]])

If you do not want to tile, you can reshape A to extract (2, 2) blocks and use broadcasting to subtract B:

C = A.reshape(A.shape[0]//2, 2, A.shape[1]//2, 2).swapaxes(1, 2)
C - B
array([[[[ 0, -1],
         [-2, -3]],

        [[ 1,  0],
         [-1, -2]]],


       [[[ 2,  1],
         [ 0, -1]],

        [[ 3,  2],
         [ 1,  0]]]])

Then swap the axes back and reshape:

(C - B).swapaxes(1, 2).reshape(A.shape[0], A.shape[1])

This should be significantly faster, since C is a view on A, not a newly constructed array.
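The view claim can be checked with np.shares_memory, as in the sketch below. It assumes A is C-contiguous (as a freshly created array is); the names c_is_view and result are illustrative.

```python
import numpy as np

A = np.array([[1, 1, 2, 2],
              [1, 1, 2, 2],
              [3, 3, 4, 4],
              [3, 3, 4, 4]])
B = np.array([[1, 2],
              [3, 4]])

# C has shape (block_row, block_col, 2, 2) and reuses A's memory
C = A.reshape(A.shape[0] // 2, 2, A.shape[1] // 2, 2).swapaxes(1, 2)
c_is_view = np.shares_memory(C, A)

# C - B allocates the result; swapping the axes back restores the 2D layout
result = (C - B).swapaxes(1, 2).reshape(A.shape)
```

Only the subtraction result is allocated here; B is never repeated in memory.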
