Why does `numpy.ndarray.view` ignore a previous call to `numpy.ndarray.newbyteorder`?

I have a NumPy array with one element of data type uint32:

>>> import numpy as np
>>> a = np.array([123456789], dtype=np.uint32)
>>> a.dtype.byteorder
'='

Then, I can choose to interpret the data as little-endian:

>>> a.newbyteorder("<").dtype.byteorder
'<'
>>> a.newbyteorder("<")
array([123456789], dtype=uint32)

Or as big-endian:

>>> a.newbyteorder(">").dtype.byteorder
'>'
>>> a.newbyteorder(">")
array([365779719], dtype=uint32)

The latter returns a different number, 365779719, because my platform is little-endian and the value was therefore written to memory in little-endian order.

What I find unexpected is that a subsequent chained call to view seems to be unaffected by this interpretation:

>>> a.newbyteorder("<").view(np.uint8)
array([ 21, 205,  91,   7], dtype=uint8)
>>> a.newbyteorder(">").view(np.uint8)
array([ 21, 205,  91,   7], dtype=uint8)

I would have expected the bytes to come out in the opposite order for the big-endian byte order. Why doesn't this happen? Doesn't view look at the data "through" the newbyteorder call?

By the way: if I use byteswap instead of newbyteorder, and therefore copy the data and change the bytes in memory, I obviously get the desired result:

>>> a.view(np.uint8)
array([ 21, 205,  91,   7], dtype=uint8)
>>> a.byteswap().view(np.uint8)
array([  7,  91, 205,  21], dtype=uint8)

However, I don't want to copy the data.

The new byte order applied with newbyteorder is solely a property of the array's dtype; a.newbyteorder("<") returns a view of a with a little-endian dtype. It doesn't change the contents of memory, and it doesn't affect the array's shape or strides.

ndarray.view doesn't care about the original array's dtype, little-endian or big-endian. It cares about the array's shape, strides, and actual memory content, none of which have changed.
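To make that concrete, here is a minimal sketch. It assumes the array is created with an explicit "<u4" dtype so the byte values are the same on any platform, and it uses a.view(a.dtype.newbyteorder(">")) in place of a.newbyteorder(">"), since the ndarray method is no longer available in NumPy 2.0 and later. The variable names are only illustrative.

```python
import numpy as np

# Explicit little-endian dtype so the result does not depend on the platform.
a = np.array([123456789], dtype="<u4")

# Same buffer, but the dtype now says "read these 4 bytes as big-endian".
be = a.view(a.dtype.newbyteorder(">"))

print(be)                           # the value is reinterpreted
print(np.shares_memory(a, be))      # both arrays look at the same buffer
print(a.tobytes() == be.tobytes())  # the raw bytes are identical
print(be.view(np.uint8))            # so a uint8 view sees the original byte order
```

The last line prints the bytes in their stored order because the uint8 view is built from the buffer, not from the big-endian interpretation.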

Just to add to @user2357112's answer, from the documentation:

As you can imagine from the introduction, there are two ways you can affect the relationship between the byte ordering of the array and the underlying memory it is looking at:

  • Change the byte-ordering information in the array dtype so that it interprets the underlying data as being in a different byte order. This is the role of arr.newbyteorder()
  • Change the byte-ordering of the underlying data, leaving the dtype interpretation as it was. This is what arr.byteswap() does.

My emphasis in the quote above.
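The two options from the quote can be put side by side. This is only a sketch with illustrative variable names, again assuming an explicit "<u4" dtype and the dtype-based spelling of newbyteorder:

```python
import numpy as np

a = np.array([123456789], dtype="<u4")

# Option 1: change only the dtype's byte-order flag; the bytes stay where they are.
reinterpreted = a.view(a.dtype.newbyteorder(">"))

# Option 2: swap the bytes into a new buffer; the dtype stays little-endian.
swapped = a.byteswap()

print(reinterpreted, reinterpreted.tobytes())
print(swapped, swapped.tobytes())
print(np.shares_memory(a, swapped))
```

Both arrays print the same number, 365779719, but they reach it from opposite directions: the first reads the original bytes differently, the second reads different bytes the original way.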


Other thoughts gathered from the comments:

Since newbyteorder() is similar to view() in that it just changes the interpretation of the underlying data without changing the data, it appears that a view into a view is a view of the same (original) data. So, yes, you cannot "chain" views (well, you can... but it is always a view of the same original data).

How do I get the uint8 chunks in big-endian order without changing the memory, then?

Try np.sum(a.newbyteorder('<')) (alternatively, try a.newbyteorder('<').tolist()) and then repeat with the other byte-order sign ('>'). So, my answer to the above question would be that you can't do that: either the bytes are changed "in-place" in memory with byteswap(), or the data is copied to a new memory location when the elements are accessed through the view.
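If a copy turns out to be acceptable after all, one way to get the bytes in big-endian order is to convert to a big-endian dtype first and take the uint8 view of the result. This is a sketch of that copy-based route, not a way around the limitation described above; the names big and big_bytes are only illustrative.

```python
import numpy as np

a = np.array([123456789], dtype="<u4")

# astype keeps the values but writes them into a new buffer in big-endian layout.
big = a.astype(">u4")
big_bytes = big.view(np.uint8)

print(big)                       # same value as a
print(big_bytes)                 # bytes, most significant first
print(np.shares_memory(a, big))  # a separate buffer, so a is untouched
```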

In [280]: a = np.array([123456789, 234567891, 345678912], dtype=np.uint32)

In [282]: a.tobytes()
Out[282]: b'\x15\xcd[\x07\xd38\xfb\r@\xa4\x9a\x14'

In [284]: a.view('uint8')
Out[284]: 
array([ 21, 205,  91,   7, 211,  56, 251,  13,  64, 164, 154,  20],
      dtype=uint8)

This is the same as a.view('<u1') and a.view('>u1'), since endianness doesn't matter for single bytes.

In [291]: a.view('<u4')
Out[291]: array([123456789, 234567891, 345678912], dtype=uint32)
In [292]: a.view('>u4')
Out[292]: array([ 365779719, 3543726861, 1084529172], dtype=uint32)

A view depends entirely on the data, not on the current (last) view:

In [293]: a.view('<u4').view('u1')
Out[293]: 
array([ 21, 205,  91,   7, 211,  56, 251,  13,  64, 164, 154,  20],
      dtype=uint8)
In [294]: a.view('>u4').view('u1')
Out[294]: 
array([ 21, 205,  91,   7, 211,  56, 251,  13,  64, 164, 154,  20],
      dtype=uint8)

About the idea of reshaping and reversing:

In [295]: a.view('u1').reshape(-1,4)
Out[295]: 
array([[ 21, 205,  91,   7],
       [211,  56, 251,  13],
       [ 64, 164, 154,  20]], dtype=uint8)
In [296]: a.view('u1').reshape(-1,4)[:,::-1]
Out[296]: 
array([[  7,  91, 205,  21],
       [ 13, 251,  56, 211],
       [ 20, 154, 164,  64]], dtype=uint8)

But I can't change the view of this array (to u4) because it isn't contiguous:

In [297]: a.view('u1').reshape(-1,4)[:,::-1].view('<u4')
....
ValueError: To change to a dtype of a different size, the array must be C-contiguous

Look a bit more at the properties of this reversed array:

In [298]: a1 = a.view('u1').reshape(-1,4)[:,::-1]
In [299]: a1.flags
Out[299]: 
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  ....
In [300]: a1.strides             # reversing is done with strides
Out[300]: (4, -1)

The two arrays, a and a1, share the same data buffer; a1 just starts at a different byte:

In [301]: a.__array_interface__['data']
Out[301]: (32659520, False)
In [302]: a1.__array_interface__['data']
Out[302]: (32659523, False)

I can't do an in-place shape change of a1:

In [304]: a1.shape = (12,)
...
AttributeError: incompatible shape for a non-contiguous array

If I do a reshape, I get a copy (as shown by a totally different data buffer address):

In [305]: a2 = a1.reshape(-1)
In [306]: a2
Out[306]: 
array([  7,  91, 205,  21,  13, 251,  56, 211,  20, 154, 164,  64],
      dtype=uint8)
In [307]: a2.view('<u4')
Out[307]: array([ 365779719, 3543726861, 1084529172], dtype=uint32)
In [308]: a2.__array_interface__['data']
Out[308]: (37940512, False)

So you can view the same data buffer with different endianness, but you can't view individual bytes in a different order without either making a non-contiguous array or making a copy.
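As a quick check that the non-contiguous route really is zero-copy, the sketch below writes through the reversed byte view and reads the change back from the original array. It assumes an explicit "<u4" dtype so the byte positions are the same on any platform; rev is just an illustrative name.

```python
import numpy as np

a = np.array([123456789, 234567891, 345678912], dtype="<u4")

# Per-element byte reversal done purely with strides: no data is copied.
rev = a.view(np.uint8).reshape(-1, 4)[:, ::-1]

print(rev)
print(np.shares_memory(a, rev), rev.flags["C_CONTIGUOUS"], rev.strides)

# rev[0, 0] is the most significant byte of a[0]; zeroing it changes a itself.
rev[0, 0] = 0
print(a[0])
```

The price is that rev cannot be passed to view('<u4') or flattened without a copy, as the traceback above shows.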


The newbyteorder docs say it is equivalent to:

arr.view(arr.dtype.newbyteorder(new_order))

So a.view('<u4').newbyteorder('>') is the same as a.view('>u4'). None of these changes a.
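That equivalence can be checked directly. The sketch below uses the dtype-based spelling from the docs rather than the ndarray method, which should work on both older and newer NumPy releases; via_dtype and direct are illustrative names.

```python
import numpy as np

a = np.array([123456789, 234567891, 345678912], dtype="<u4")
before = a.tobytes()

# The documented expansion of newbyteorder('>') applied to a little-endian view.
via_dtype = a.view("<u4").view(a.dtype.newbyteorder(">"))
# Asking for the big-endian view directly.
direct = a.view(">u4")

print(via_dtype.dtype == direct.dtype)
print(np.array_equal(via_dtype, direct))
print(a.tobytes() == before)  # a itself is unchanged
```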
