How do numpy's in-place operations (e.g. `+=`) work?

Question

The basic question is: what happens under the hood when doing a[i] += b?

Given the following:

import numpy as np
a = np.arange(4)
i = a > 0
i
= array([False,  True,  True,  True], dtype=bool)

I understand that:

a[i] = x is the same as a.__setitem__(i, x), which assigns directly to the items indicated by i
a += x is the same as a.__iadd__(x), which does the addition in place

But what happens when I do:

a[i] += x

Specifically:

Is this the same as a[i] = a[i] + x? (which is not an in-place operation)
Does it make a difference in this case if i is:
- an int index, or
- an ndarray, or
- a slice object

Background

The reason I started delving into this is that I encountered non-intuitive behavior when working with duplicate indices:

a = np.zeros(4)
x = np.arange(4)
indices = np.zeros(4, dtype=int)  # duplicate indices (np.int was removed in newer NumPy releases)
a[indices] += x
a
= array([ 3.,  0.,  0.,  0.])

There is more interesting material about duplicate indices in this question.

Answer 1

The first thing you need to realise is that a += x doesn't map exactly to a.__iadd__(x); instead it maps to a = a.__iadd__(x). Notice that the documentation specifically says that in-place operators return their result, and this doesn't have to be self (although in practice, it usually is). This means a[i] += x trivially maps to:

a.__setitem__(i, a.__getitem__(i).__iadd__(x))

So, the addition technically happens in-place, but only on a temporary object. There is still potentially one less temporary object created than if it called __add__, though.
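The expansion above can be written out by hand. This is a minimal sketch, not a description of NumPy internals: the names `tmp`, `view`, `expanded` and `augmented` are illustrative only, and it assumes the standard NumPy indexing rules (a slice returns a view, a boolean or integer array index returns a copy). It suggests that the kind of index matters mainly for whether the temporary shares memory with `a`.

```python
import numpy as np

x = 10

# Manual expansion of a[i] += x for a boolean mask.
a = np.arange(4)
i = a > 0
tmp = a.__getitem__(i)      # fancy indexing: tmp is a copy, not a view
tmp = tmp.__iadd__(x)       # in-place add, but only on the temporary
a.__setitem__(i, tmp)       # write the result back into a
expanded = a.copy()

# The same operation written as an augmented assignment.
b = np.arange(4)
b[b > 0] += x
augmented = b.copy()

# With a slice, __getitem__ returns a view, so the in-place add
# already modifies the original buffer before any __setitem__ runs.
c = np.arange(4)
view = c[1:]
view += x

slice_shares_memory = np.shares_memory(c, view)   # view case
mask_shares_memory = np.shares_memory(a, tmp)     # copy case
```

In the mask case the write-back through `__setitem__` is what actually changes `a`; in the slice case it is redundant, because the view already wrote into `c`.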

Answer 2

Actually that has nothing to do with numpy. There is no "set/getitem in-place" in Python; these things are equivalent to a[indices] = a[indices] + x. Knowing that, it becomes pretty obvious what is going on. (EDIT: As lvc writes, the right-hand side is actually computed in place, so it would be a[indices] = (a[indices] += x) if that were legal syntax. That has largely the same effect, though.)

Of course a += x actually is in-place, by mapping a to the np.add out argument.
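A small sketch of what "mapping a to the out argument" means in practice. The variable `alias` is only there for illustration; the point is that the result is written into the existing buffer, so a second name for the same array sees the change.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
alias = a                  # second name for the same array object

np.add(a, 4, out=a)        # result is written into a's own buffer

same_object = alias is a   # no new array was bound to either name
```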

This has been discussed before, and numpy cannot do anything about it as such. There is, however, an idea to add np.add.at(array, index_expression, x) to at least allow such operations.
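In current NumPy releases `np.add.at` exists as part of the `ufunc.at` family, so the duplicate-index case from the question can be compared directly. This is a minimal sketch: the names `buffered` and `accumulated` are illustrative, and the data mirrors the question's example.

```python
import numpy as np

x = np.arange(4.0)
indices = np.zeros(4, dtype=int)    # every index points at element 0

# Buffered: a[indices] is read once, x is added to that temporary,
# and the result is written back, so duplicate writes overwrite each other.
buffered = np.zeros(4)
buffered[indices] += x

# Unbuffered: np.add.at applies every contribution, duplicates included.
accumulated = np.zeros(4)
np.add.at(accumulated, indices, x)  # accumulated[0] is 0 + 1 + 2 + 3 = 6
```

`np.add.at` is usually slower than the buffered form, so it tends to be worth using only when repeated indices are actually expected.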

Answer 3

As lvc explains, there is no in-place item add method, so under the hood it uses __getitem__, then __iadd__, then __setitem__. Here's a way to empirically observe that behavior:

import numpy

class A(numpy.ndarray):
    def __getitem__(self, *args, **kwargs):
        print("getitem")
        return numpy.ndarray.__getitem__(self, *args, **kwargs)
    def __setitem__(self, *args, **kwargs):
        print("setitem")
        return numpy.ndarray.__setitem__(self, *args, **kwargs)
    def __iadd__(self, *args, **kwargs):
        print("iadd")
        return numpy.ndarray.__iadd__(self, *args, **kwargs)

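# A([1,2,3]) would create an uninitialized array of shape (1, 2, 3);
# build the same shape from zeros so the data is well defined.
# a[0] is then a 2-D A view, which is why A.__iadd__ is the method called.
a = numpy.zeros((1, 2, 3)).view(A)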
print("about to increment a[0]")
a[0] += 1

It prints

about to increment a[0]
getitem
iadd
setitem

Answer 4

I don't know what's going on under the hood, but in-place operations on items in NumPy arrays and in Python lists will return the same reference, which IMO can lead to confusing results when passed into a function.

Start with Python

>>> a = [1, 2, 3]
>>> b = a
>>> a is b
True
>>> id(a[2])
12345
>>> id(b[2])
12345

... where 12345 is a unique id for the location of the value at a[2] in memory, which is the same as b[2].

So a and b refer to the same list in memory. Now try in-place addition on an item in the list.

>>> a[2] += 4
>>> a
[1, 2, 7]
>>> b
[1, 2, 7]
>>> a is b
True
>>> id(a[2])
67890
>>> id(b[2])
67890

So in-place addition on the item only changed the value at index 2. a and b still reference the same list, although the 3rd item in the list was reassigned to a new value, 7. The reassignment explains why, if a = 4 and b = a were integers (or floats) instead of lists, a += 1 would cause a to be reassigned, and then b and a would be different references. However, if list addition is called, e.g. a += [5] for a and b referencing the same list, it does not reassign a; both names will see the appended item.
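The contrast between immutable and mutable operands can be checked in a few lines of plain Python. The variable names here are illustrative only.

```python
# Integers are immutable: += builds a new object and rebinds the name.
n = 4
m = n
n += 1                # n is rebound to 5, m still refers to 4

# Lists are mutable: list.__iadd__ extends in place and returns self.
p = [1, 2, 3]
q = p
p += [5]              # both names still refer to the one list
same_list = p is q
```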

Now for NumPy

>>> import numpy as np
>>> a = np.array([1, 2, 3], float)
>>> b = a
>>> a is b
True

Again these are the same reference, and in-place operators seem to have the same effect as for a list in Python:

>>> a += 4
>>> a
array([ 5.,  6.,  7.])
>>> b
array([ 5.,  6.,  7.])

In-place addition modifies the existing ndarray, so every reference to it sees the change. This is not the same as a = a + 4, which calls numpy.add, creates a new array, and rebinds a to it while b keeps the old one.

>>> a = a + 4
>>> a
array([  9.,  10.,  11.])
>>> b
array([ 5.,  6.,  7.])

In-place operations on borrowed references

I think the danger here is if the reference is passed to a different scope.

>>> def f(x):
...     x += 4
...     return x

The reference passed as x enters the scope of f without a copy being made, so f changes the value at that reference and passes the same object back.

>>> f(a)
array([ 13.,  14.,  15.])
>>> f(a)
array([ 17.,  18.,  19.])
>>> f(a)
array([ 21.,  22.,  23.])
>>> f(a)
array([ 25.,  26.,  27.])

The same would be true for a Python list as well:

>>> def f(x, y):
...     x += [y]

>>> a = [1, 2, 3]
>>> b = a
>>> f(a, 5)
>>> a
[1, 2, 3, 5]
>>> b
[1, 2, 3, 5]

IMO this can be confusing and sometimes difficult to debug, so I try to only use in-place operators on references that belong to the current scope, and I try to be careful with borrowed references.
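One way to follow that advice is to avoid the augmented operator on arguments. This is a sketch with hypothetical function names (`add_four_in_place`, `add_four_copy`), not a recommendation from the NumPy documentation.

```python
import numpy as np

def add_four_in_place(x):
    # ndarray.__iadd__ mutates the caller's array
    x += 4
    return x

def add_four_copy(x):
    # x + 4 builds a new array; only the local name x is rebound
    x = x + 4
    return x

a = np.array([1.0, 2.0, 3.0])

r_copy = add_four_copy(a)
a_after_copy = a.copy()          # caller's array is unchanged here

r_in_place = add_four_in_place(a)
a_after_in_place = a.copy()      # caller's array has been modified
returned_same_object = r_in_place is a
```

The copying version costs one extra allocation per call, which is often an acceptable price for not surprising the caller.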

Answer 1: 17, accepted, 2013-04-16 10:59:41
Answer 2: 5, 2013-04-16 10:53:16
Answer 3: 2, 2013-04-16 11:10:51
Answer 4: 2, 2016-05-04 23:55:31