
Numpy: Why doesn't 'a += a.T' work?

As stated in the SciPy lecture notes, this will not work as expected:

import numpy as np

a = np.random.randint(0, 10, (1000, 1000))
a += a.T
assert np.allclose(a, a.T)

But why? How does being a view affect this behavior?

This problem is due to the internal design of NumPy. It boils down to this: the in-place operator changes values as it goes, and those changed values are then used where you intended the original values to be used.
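To make the "view" part concrete, here is a minimal sketch showing that a.T shares its memory with a, so a write through one name is visible through the other. The 3x3 array and the name t are only illustrative.

```python
import numpy as np

a = np.arange(9).reshape(3, 3).copy()  # copy() so that `a` owns its data
t = a.T                                # transpose: a view, no data is copied

print(np.shares_memory(a, t))  # True: both names refer to the same buffer
print(t.base is a)             # True: t is a view onto a

a[0, 2] = 100                  # write through `a` ...
print(t[2, 0])                 # 100: ... and the change shows up through `t`
```

Because both operands of a += a.T live in the same buffer, every write to a can change what a later read of a.T returns.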

This is discussed in this bug report, and it does not seem to be fixable.

The reason it works for smaller arrays seems to be how the data is buffered while being worked on.

To understand exactly why the issue crops up, I am afraid you will have to dig into the internals of NumPy.

a += a.T

does the summation in place (it reads from the view a.T while processing), so you end up with a non-symmetric matrix. You can easily check this; for example, I got:

In [3]: a
Out[3]: 
array([[ 6, 15,  7, ...,  8,  9,  2],
       [15,  6,  9, ..., 14,  9,  7],
       [ 7,  9,  0, ...,  9,  5,  8],
       ..., 
       [ 8, 23, 15, ...,  6,  4, 10],
       [18, 13,  8, ...,  4,  2,  6],
       [ 3,  9,  9, ..., 16,  8,  4]])

You can see it's not symmetric, right? (Compare the top-right and bottom-left items.)

If you make a real copy:

a += np.array(a.T)

it works fine, i.e.:
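A small self-contained check of the copy-based workaround, as a sketch: the 200x200 size and the name original are arbitrary choices, and a.T.copy() is used here as an equivalent way of spelling np.array(a.T), which also copies by default.

```python
import numpy as np

original = np.random.randint(0, 10, (200, 200))

a = original.copy()
a += a.T.copy()   # the right-hand side no longer shares memory with `a`

print(np.array_equal(a, a.T))                    # True: result is symmetric
print(np.array_equal(a, original + original.T))  # True: matches the out-of-place sum
```

As far as I know, newer NumPy releases (1.13 and later) detect this kind of overlap and make the temporary copy themselves, so there the explicit copy is harmless rather than required.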

In [6]: a
Out[6]: 
array([[ 2, 11,  8, ...,  9, 15,  5],
       [11,  4, 14, ..., 10,  3, 13],
       [ 8, 14, 14, ..., 10,  9,  3],
       ..., 
       [ 9, 10, 10, ..., 16,  7,  6],
       [15,  3,  9, ...,  7, 14,  1],
       [ 5, 13,  3, ...,  6,  1,  2]])

To better understand why it does so, imagine you wrote the loop yourself as follows:

In [8]: for i in range(1000):
   ...:     for j in range(1000):
   ...:         a[j, i] += a[i, j]
   ...:

In [9]: a
Out[9]: 
array([[ 4,  5, 14, ..., 12, 16, 13],
       [ 3,  2, 16, ..., 16,  8,  8],
       [ 9, 12, 10, ...,  7,  7, 23],
       ..., 
       [ 8, 10,  6, ..., 14, 13, 23],
       [10,  4,  6, ...,  9, 16, 21],
       [11,  8, 14, ..., 16, 12, 12]])

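It adds a[999,0] to compute a[0,999], but by then a[999,0] already holds the sum a[999,0] + a[0,999], so the original a[0,999] is counted twice. Every element whose mirror image was written earlier is affected this way: in this loop those are the elements above the main diagonal, while with NumPy's row-by-row iteration they are the ones below it.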
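The same loop can be traced by hand on a 3x3 array. This is only a sketch: transpose_add_loop is a hypothetical helper that runs the loop from the text on a copy, so the result can be compared with the out-of-place sum.

```python
import numpy as np

def transpose_add_loop(a):
    """Hypothetical helper: the loop from the text, run on a copy of `a`."""
    out = a.copy()
    n = out.shape[0]
    for i in range(n):
        for j in range(n):
            out[j, i] += out[i, j]  # may read a value that was already updated
    return out

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(transpose_add_loop(a))
# [[ 2  8 13]
#  [ 6 10 20]
#  [10 14 18]]

print(a + a.T)
# [[ 2  6 10]
#  [ 6 10 14]
#  [10 14 18]]
```

With this loop order, the elements written first agree with a + a.T, while the elements written later have picked up an already-updated value (for example 8 = 2 + 6 instead of 6 = 2 + 4).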

assert np.allclose(a, a.T)

I only now understood that you are generating a symmetric matrix by summing a with its transpose a.T.

That makes us rightfully expect np.allclose(a, a.T) to return True (the resulting matrix is symmetric, so it should be equal to its transpose).

a += a.T  # How does being a view affect this behavior?

I've just narrowed it down to this:

TL;DR: a = a + a.T is fine for larger matrices, while a += a.T gives strange results starting from size 91x91.

>>> a = np.random.randint(0, 10, (1000, 1000))
>>> a += a.T  # using the += operator
>>> np.allclose(a, a.T)
False
>>> a = np.random.randint(0, 10, (1000, 1000))
>>> a = a + a.T # using the + operator
>>> np.allclose(a, a.T)
True
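One way to see why the two spellings differ, sketched on an arbitrary 4x4 array: a = a + a.T builds a new array and only then rebinds the name, so the original buffer is never written while it is being read, whereas an in-place operator writes into the existing buffer. The names before and b are just for illustration.

```python
import numpy as np

a = np.random.randint(0, 10, (4, 4))
before = a            # second name for the same array object

a = a + a.T           # builds a new array, then rebinds the name `a`
print(a is before)                  # False: `a` now names a different object
print(np.shares_memory(a, before))  # False: the result has its own buffer
print(np.array_equal(a, a.T))       # True: the out-of-place sum is symmetric

b = before
b += 1                # in-place: writes into the existing buffer
print(b is before)    # True: still the same array object
```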

I get the same cut-off at size 90x90 as @Hannes (I am on Python 2.7 and NumPy 1.11.0, so there are at least two environments that can reproduce this).

>>> a = np.random.randint(0, 10, (90, 90))
>>> a += a.T
>>> np.allclose(a, a.T)
True
>>> a = np.random.randint(0, 10, (91, 91))
>>> a += a.T
>>> np.allclose(a, a.T)
False
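A possible explanation for the 90/91 boundary, offered as a guess rather than something confirmed in this thread: NumPy's ufunc machinery works through an internal buffer whose size np.getbufsize() reports (8192 by default, as far as I know), and 90 * 90 = 8100 fits under that value while 91 * 91 = 8281 does not. Treating the buffer size as a count of elements is an assumption on my part.

```python
import numpy as np

bufsize = np.getbufsize()   # size of the buffer used by ufuncs
print(bufsize)

for n in (90, 91):
    # Compare the number of elements of an n x n array with the buffer size
    print(n, n * n, n * n <= bufsize)
```

If the guess is right, a 90x90 array is read completely into the buffer before anything is written back, which would behave like an implicit copy.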

For large arrays, the in-place operator applies the addition to the very array that is currently being read from. For example:

>>> a = np.random.randint(0, 10, (1000, 1000))
>>> a
array([[9, 4, 2, ..., 7, 0, 6],
       [8, 4, 1, ..., 3, 5, 9],
       [6, 4, 9, ..., 6, 9, 7],
       ..., 
       [6, 2, 5, ..., 0, 4, 6],
       [5, 7, 9, ..., 8, 0, 5],
       [2, 0, 1, ..., 4, 3, 5]])

Notice that the top-right and bottom-left elements are 6 and 2.

>>> a += a.T
>>> a
array([[18, 12,  8, ..., 13,  5,  8],
       [12,  8,  5, ...,  5, 12,  9],
       [ 8,  5, 18, ..., 11, 18,  8],
       ..., 
       [19,  7, 16, ...,  0, 12, 10],
       [10, 19, 27, ..., 12,  0,  8],
       [10,  9,  9, ..., 14, 11, 10]])

Now notice that the top-right element is correct (8 = 6 + 2). However, the bottom-left element is the result not of 6 + 2, but of 8 + 2. In other words, the value added to the bottom-left element is the top-right element of the array after the addition. You'll notice this is true for all the other elements below the first row as well.
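The corner values can be reproduced with a 2x2 stand-in for the large array. This is a sketch that assumes elements are visited in row order with no buffering; the array is hypothetical, built from the four corner values shown above (9, 6, 2, 5).

```python
import numpy as np

a = np.array([[9, 6],
              [2, 5]])

out = a.copy()
for i in range(2):
    for j in range(2):
        out[i, j] += out[j, i]   # row order: out[0, 1] is updated before out[1, 0]

print(out)
# [[18  8]
#  [10 10]]
```

The top-right entry is 6 + 2 = 8, and the bottom-left entry is 2 + 8 = 10 because it reads the already-updated top-right value, matching the corners of the large result.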

I imagine it works this way so that NumPy does not need to make a copy of your array. (Though it then looks like it does make a copy if the array is small.)
