Why does `numpy.ndarray.view` ignore a previous call to `numpy.ndarray.newbyteorder`?
I have a NumPy array with one element of data type uint32:
>>> import numpy as np
>>> a = np.array([123456789], dtype=np.uint32)
>>> a.dtype.byteorder
'='
Then, I can choose to interpret the data as little-endian:
>>> a.newbyteorder("<").dtype.byteorder
'<'
>>> a.newbyteorder("<")
array([123456789], dtype=uint32)
Or as big-endian:
>>> a.newbyteorder(">").dtype.byteorder
'>'
>>> a.newbyteorder(">")
array([365779719], dtype=uint32)
The latter returns a different number, 365779719, because my platform is little-endian and the value was therefore written to memory in little-endian order.
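To see where 365779719 comes from, here is a minimal sketch using only the standard library (no NumPy). It takes the four bytes a little-endian machine stores for 123456789 and decodes them under both conventions:

```python
# The same four bytes, decoded under both byte-order conventions.
value = 123456789
raw = value.to_bytes(4, byteorder="little")  # bytes as a little-endian machine stores them

print(list(raw))                       # [21, 205, 91, 7]
print(int.from_bytes(raw, "little"))   # 123456789
print(int.from_bytes(raw, "big"))      # 365779719
```

The bytes never change; only the rule used to decode them does.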
Now, what's unexpected to me is that a subsequent chained call to view seems to be unaffected by this interpretation:
>>> a.newbyteorder("<").view(np.uint8)
array([ 21, 205, 91, 7], dtype=uint8)
>>> a.newbyteorder(">").view(np.uint8)
array([ 21, 205, 91, 7], dtype=uint8)
I would have expected the numbers to be the other way round for the big-endian byte order. Why doesn't this happen? Doesn't view look at the data "through" the newbyteorder method?
By the way: if I use byteswap instead of newbyteorder, and therefore copy and change the bytes in memory, I obviously get the desired result:
>>> a.byteswap("<").view(np.uint8)
array([ 21, 205, 91, 7], dtype=uint8)
>>> a.byteswap(">").view(np.uint8)
array([ 7, 91, 205, 21], dtype=uint8)
However, I don't want to copy the data.
The new byte order applied with newbyteorder is solely a property of the array's dtype; a.newbyteorder("<") returns a view of a with a little-endian dtype. It doesn't change the contents of memory, and it doesn't affect the array's shape or strides.
ndarray.view doesn't care about the original array's dtype, little-endian or big. It cares about the array's shape, strides, and actual memory content, none of which have changed.
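A small sketch of this point, under two assumptions of mine: the array uses an explicit "<u4" dtype so the numbers do not depend on the host platform, and the big-endian view is spelled a.view(a.dtype.newbyteorder(">")), which also works on NumPy 2.0 where the ndarray.newbyteorder method was removed.

```python
import numpy as np

# Explicit little-endian dtype so the result does not depend on the host platform.
a = np.array([123456789], dtype="<u4")

# Version-independent spelling of a.newbyteorder(">").
big = a.view(a.dtype.newbyteorder(">"))

print(big.tolist())                    # [365779719]: same bytes, decoded as big-endian
print(np.shares_memory(a, big))        # True: no data was copied
print(a.shape == big.shape, a.strides == big.strides)   # True True

# Both arrays sit on the same buffer, so a uint8 view of either one is identical.
same_bytes = np.array_equal(a.view(np.uint8), big.view(np.uint8))
print(same_bytes)                      # True
```

Only the dtype differs between a and big, and view(np.uint8) replaces the dtype anyway.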
Just to add to @user2357112's answer, from the documentation:
As you can imagine from the introduction, there are two ways you can affect the relationship between the byte ordering of the array and the underlying memory it is looking at:
- Change the byte-ordering information in the array dtype so that it interprets the underlying data as being in a different byte order. This is the role of arr.newbyteorder()
- Change the byte-ordering of the underlying data, leaving the dtype interpretation as it was. This is what arr.byteswap() does.
My emphasis in the quote above.
Other thoughts gathered from the comments:
Since newbyteorder() is similar to view() in that it only changes the interpretation of the underlying data without changing the data itself, it appears that a view of a view is a view of the same (original) data. So, yes, you cannot "chain" views (well, you can... but the result is always a view of the same original data).
How do I get the uint8 chunks in big-endian order without changing the memory, then?
Try np.sum(a.newbyteorder('<')) (or a.newbyteorder('<').tolist()), and then repeat with the sign flipped to '>' (the other endianness). So, my answer to the above question would be that you can't do that: either the memory is changed in place with byteswap(), or the data is copied to a new memory location when the elements of the view are accessed.
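For comparison, here is a sketch of two routes that do give the bytes in big-endian order, at the cost of a new buffer. The explicit "<u4" dtype is my assumption, to keep the output platform-independent. byteswap() returns a swapped copy by default (inplace=False; pass inplace=True to overwrite the original instead), and astype(">u4") keeps the values but stores them big-endian.

```python
import numpy as np

a = np.array([123456789], dtype="<u4")

swapped = a.byteswap()        # inplace=False by default: returns a byte-swapped copy
converted = a.astype(">u4")   # same values, stored big-endian in a new buffer

print(swapped.view(np.uint8).tolist())     # [7, 91, 205, 21]
print(converted.view(np.uint8).tolist())   # [7, 91, 205, 21]
print(converted.tolist())                  # [123456789]: the value is preserved
print(swapped.tolist())                    # [365779719]: dtype unchanged, bytes swapped

# Neither result shares memory with a, and a itself is untouched.
print(np.shares_memory(a, swapped), np.shares_memory(a, converted))   # False False
print(a.view(np.uint8).tolist())           # [21, 205, 91, 7]
```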
In [280]: a = np.array([123456789, 234567891, 345678912], dtype=np.uint32)
In [282]: a.tobytes()
Out[282]: b'\x15\xcd[\x07\xd38\xfb\r@\xa4\x9a\x14'
In [284]: a.view('uint8')
Out[284]:
array([ 21, 205, 91, 7, 211, 56, 251, 13, 64, 164, 154, 20],
dtype=uint8)
This is the same as a.view('<u1') and a.view('>u1'), since endianness doesn't matter for single bytes.
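A quick check of that claim, as a sketch: NumPy treats the byte-order prefix on a one-byte dtype as not applicable, so the two spellings compare equal.

```python
import numpy as np

print(np.dtype("<u1") == np.dtype(">u1"))   # True
print(np.dtype(">u1").byteorder)            # | (byte order not applicable)
```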
In [291]: a.view('<u4')
Out[291]: array([123456789, 234567891, 345678912], dtype=uint32)
In [292]: a.view('>u4')
Out[292]: array([ 365779719, 3543726861, 1084529172], dtype=uint32)
A view depends entirely on the data, not on the current (last) view:
In [293]: a.view('<u4').view('u1')
Out[293]:
array([ 21, 205, 91, 7, 211, 56, 251, 13, 64, 164, 154, 20],
dtype=uint8)
In [294]: a.view('>u4').view('u1')
Out[294]:
array([ 21, 205, 91, 7, 211, 56, 251, 13, 64, 164, 154, 20],
dtype=uint8)
About the idea of reshaping and reversing:
In [295]: a.view('u1').reshape(-1,4)
Out[295]:
array([[ 21, 205, 91, 7],
[211, 56, 251, 13],
[ 64, 164, 154, 20]], dtype=uint8)
In [296]: a.view('u1').reshape(-1,4)[:,::-1]
Out[296]:
array([[ 7, 91, 205, 21],
[ 13, 251, 56, 211],
[ 20, 154, 164, 64]], dtype=uint8)
But I can't change the view (to u4) of this array because it isn't contiguous:
In [297]: a.view('u1').reshape(-1,4)[:,::-1].view('<u4')
....
ValueError: To change to a dtype of a different size, the array must be C-contiguous
Look a bit more at the properties of this reversed array:
In [298]: a1 = a.view('u1').reshape(-1,4)[:,::-1]
In [299]: a1.flags
Out[299]:
C_CONTIGUOUS : False
F_CONTIGUOUS : False
....
In [300]: a1.strides # reversing is done with strides
Out[300]: (4, -1)
The 2 arrays share the same databuffer. a1 just starts at a different byte:
In [301]: a.__array_interface__['data']
Out[301]: (32659520, False)
In [302]: a1.__array_interface__['data']
Out[302]: (32659523, False)
I can't do an in-place shape change of a1:
In [304]: a1.shape = (12,)
...
AttributeError: incompatible shape for a non-contiguous array
If I do a reshape, I get a copy (as shown by a totally different databuffer address):
In [305]: a2 = a1.reshape(-1)
In [306]: a2
Out[306]:
array([ 7, 91, 205, 21, 13, 251, 56, 211, 20, 154, 164, 64],
dtype=uint8)
In [307]: a2.view('<u4')
Out[307]: array([ 365779719, 3543726861, 1084529172], dtype=uint32)
In [308]: a2.__array_interface__['data']
Out[308]: (37940512, False)
So you can view the same databuffer with different endianness, but you can't view individual bytes in a different order without either making a non-contiguous array or making a copy.
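The non-contiguous option can be wrapped up roughly like this. reversed_byte_view is a hypothetical helper of mine, not a NumPy function, and it assumes a 1-D C-contiguous input; the explicit "<u4" dtype is also my choice, so the bytes match the session above on any platform.

```python
import numpy as np

def reversed_byte_view(arr):
    """Hypothetical helper (not a NumPy API): uint8 view of a 1-D C-contiguous
    array with the bytes of each item reversed. No data is copied."""
    per_item = arr.view(np.uint8).reshape(-1, arr.dtype.itemsize)
    return per_item[:, ::-1]   # the reversal is done with a negative stride

a = np.array([123456789, 234567891, 345678912], dtype="<u4")
rev = reversed_byte_view(a)

print(rev.tolist())               # [[7, 91, 205, 21], [13, 251, 56, 211], [20, 154, 164, 64]]
print(rev.strides)                # (4, -1)
print(np.shares_memory(a, rev))   # True: still the original buffer

flat = rev.reshape(-1)            # flattening the non-contiguous view forces a copy
print(np.shares_memory(a, flat))  # False
```

So the bytes can be read in reversed order without copying as long as the 2-D, negatively strided shape is acceptable; flattening it is what triggers the copy.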
The newbyteorder docs say it is equivalent to:
arr.view(arr.dtype.newbyteorder(new_order))
So a.view('<u4').newbyteorder('>') is the same as a.view('>u4'). None of these changes a.
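A sketch that checks this equivalence, written with the dtype-level dtype.newbyteorder so that it also runs on NumPy 2.0, where the ndarray.newbyteorder method no longer exists. The explicit "<u4" dtype is my assumption, to make the numbers platform-independent.

```python
import numpy as np

a = np.array([123456789, 234567891, 345678912], dtype="<u4")

# What a.view('<u4').newbyteorder('>') expands to, per the documented equivalence.
via_dtype = a.view("<u4").view(a.dtype.newbyteorder(">"))
direct = a.view(">u4")

print(via_dtype.dtype == direct.dtype)   # True
print(direct.tolist())                   # [365779719, 3543726861, 1084529172]
print(np.shares_memory(a, via_dtype))    # True: still a view of a
print(a.tolist())                        # [123456789, 234567891, 345678912]: a is unchanged
```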