Create a permuted shallow copy of a numpy array

I would like two different views of the same data, with the rows in a different order, such that changes made through one view are reflected in the other. Specifically, the following code

import numpy

# Create original array
A = numpy.array([[0, 1, 2],
                 [3, 4, 5],
                 [6, 7, 8]])
B = A.view()[[0, 2, 1], :] # Permute the rows
print("(before) B =\n", B)

# Change a value in A
A[1, 2] = 143
print("(after) A =\n", A)
print("(after) B =\n", B)

has the following output:

(before) B =
 [[0 1 2]
  [6 7 8]
  [3 4 5]]
(after) A =
 [[  0   1   2]
  [  3   4 143]
  [  6   7   8]]
(after) B =
 [[0 1 2]
  [6 7 8]
  [3 4 5]]

but I would like the last part of that to be

(after) B =
 [[0   1   2]
  [6   7   8]
  [3   4 143]]

Answers to this question state that getting a view at specific indices is not possible, though the OP of that question is asking about a subset of the array, whereas I would like a view of the entire array. (It seems that the key difference here is slicing vs. advanced, or "smart", indexing.)

A different post, asking about slicing by rows and then columns vs. columns and then rows, has an accepted answer that states "All that matters is whether you slice by rows or by columns...". So I tried dealing with a flattened view of the array:

import numpy

A = numpy.array([[0, 1, 2],
                 [3, 4, 5],
                 [6, 7, 8]])
B = A.view()
B.shape = (A.size,)

A[1, 2] = 198
print("(After first) A =\n", A)
print("(After first) B =\n", B)

# Identity index map
all_idx = numpy.arange(A.size).reshape(A.shape)

# Swapped and flattened index map
new_row_idx = all_idx[[0, 2, 1]].flatten()

C = B[new_row_idx]

print("(Before second) C =\n", C)

# Manipulate through 'B'
B[7] = 666

print("(After second) B =\n", B)
print("(After second) C =\n", C)

which gives the following output:

(After first) A =
 [[  0   1   2]
 [  3   4 198]
 [  6   7   8]]
(After first) B =
 [  0   1   2   3   4 198   6   7   8]
(Before second) C =
 [  0   1   2   6   7   8   3   4 198]
(After second) B =
 [  0   1   2   3   4 198   6 666   8]
(After second) C =
 [  0   1   2   6   7   8   3   4 198]

As you can see, the entry of C at index 4 (the value 7) is unaltered. The suggested solution in the first post I mentioned is to create a copy, make changes, and then update the original array. I can write functions to wrap this, but that doesn't reduce the number of copies I make. All it does is hide them from the user.

What am I missing here? Should I be using the data attribute of these arrays? If so, what is a good starting point for understanding how to do this?

An array has a shape, strides, dtype, and a 1-D data buffer. A view has its own shape, strides, and dtype, plus a pointer to some place in the base's data buffer. Indexing with a slice can be achieved with just these attributes.
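As a small sketch of that idea (the names `A` and `V` are only illustrative), a slice shares the base's buffer and differs only in its shape, strides, and starting offset, so a write through the slice lands in the original array:

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
V = A[1:, ::2]                    # basic slicing: rows 1 and 2, columns 0 and 2

print(V.shape, V.strides)         # V carries its own shape and strides
print(np.shares_memory(A, V))     # True: both use the same data buffer

V[0, 1] = 143                     # V[0, 1] is the same element as A[1, 2]
print(A[1, 2])                    # 143
```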

But indexing with a list such as your [0,2,1] cannot be achieved this way. So numpy makes a new array with its own data buffer, a copy. The [0,2,1] index list/array is not stored with the copy.

In [42]: import numpy as np
In [43]: A = np.arange(9).reshape(3,3)
In [44]: B = A[[0,2,1],:]
In [45]: A
Out[45]: 
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
In [46]: B
Out[46]: 
array([[0, 1, 2],
       [6, 7, 8],
       [3, 4, 5]])

ravel shows the order of elements in the data buffer:

In [47]: A.ravel()
Out[47]: array([0, 1, 2, 3, 4, 5, 6, 7, 8])

The order of elements in B is different.

In [48]: B.ravel()
Out[48]: array([0, 1, 2, 6, 7, 8, 3, 4, 5])
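A quick way to confirm that B owns a separate buffer is `np.shares_memory`. This is only a minimal check that mirrors the arrays above:

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
B = A[[0, 2, 1], :]               # advanced indexing allocates a new buffer

print(np.shares_memory(A, B))     # False: B is a copy, not a view

A[1, 2] = 143                     # change the original ...
print(B[2, 2])                    # ... B still holds the old value, 5
```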

In contrast, consider a row reordering with a slice:

In [49]: C = A[::-1,:]
In [50]: C
Out[50]: 
array([[6, 7, 8],
       [3, 4, 5],
       [0, 1, 2]])

In [52]: A.strides
Out[52]: (24, 8)

The reordering in C is achieved by simply changing the strides:

In [53]: C.strides
Out[53]: (-24, 8)

Transpose is also a view, with changed strides:

In [54]: D = A.T
In [55]: D.strides
Out[55]: (8, 24)
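Because C and D only reinterpret the same buffer, a change made through A is visible in both. The sketch below checks this. Note that the row order [0, 2, 1] has no such representation: the rows would start at offsets of 0, 2 and 1 row lengths, which a single constant row stride cannot describe.

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
C = A[::-1, :]     # rows reversed: negative row stride
D = A.T            # transposed: the two strides swapped

A[1, 2] = 143
print(C[1, 2])     # 143, because row 1 of C is row 1 of A
print(D[2, 1])     # 143, because D[j, i] is A[i, j]
```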

I was going to show C.ravel(), but realized that this reshape makes a copy (even though C is a view).
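The difference can be checked directly. For a contiguous array such as A, ravel can return a view, while flattening the reversed view C needs a new buffer:

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
C = A[::-1, :]

print(np.shares_memory(A, A.ravel()))   # True: A is contiguous, so ravel is a view
print(np.shares_memory(A, C.ravel()))   # False: flattening C requires a copy
print(C.ravel())                        # [6 7 8 3 4 5 0 1 2]
```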

The fundamental point is that anything numpy describes as advanced indexing makes a copy. Changes to the copy will not appear in the original array. https://numpy.org/doc/stable/reference/arrays.indexing.html#advanced-indexing
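Since a true permuted view is not available, one possible workaround is to keep the permutation next to the original array and route every read and write through it. The sketch below is only an illustration of that idea: `PermutedRows` and `to_array` are hypothetical names, not part of numpy, and only `obj[row, col]` style indexing is handled.

```python
import numpy as np

class PermutedRows:
    """Hypothetical helper: presents `base` with its rows reordered by `perm`.

    Nothing is copied on construction. Reads and writes are forwarded to
    `base` after mapping the row index through the stored permutation.
    """

    def __init__(self, base, perm):
        self.base = base
        self.perm = np.asarray(perm)

    def __getitem__(self, key):
        row, col = key                      # expects obj[row, col]
        return self.base[self.perm[row], col]

    def __setitem__(self, key, value):
        row, col = key
        # Assignment through an index writes into `base` in place.
        self.base[self.perm[row], col] = value

    def to_array(self):
        # Snapshot of the permuted data; this one does make a copy.
        return self.base[self.perm]

A = np.array([[0, 1, 2],
              [3, 4, 5],
              [6, 7, 8]])
B = PermutedRows(A, [0, 2, 1])

A[1, 2] = 143
print(B[2, 2])       # 143: row 2 of B is row 1 of A

B[1, 0] = -1         # row 1 of B is row 2 of A
print(A[2, 0])       # -1

print(B.to_array())
```

The trade-off is that B is not an ndarray, so any numpy function that needs the permuted data as an array has to go through `to_array()`, which copies at that point.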
