What is the fill order of NumPy's fromfile for a 2-D ndarray?

Question

I'm trying to read a structured binary file using numpy.fromfile(). In my case, I have a numpy.dtype() that defines a custom data type for use with np.fromfile().
I will reproduce only the relevant part of the data structure here (the full structure is rather long):

('RawData', np.int32, (2, BlockSize))

This reads BlockSize*2 int32 values into the field RawData and produces a 2 x BlockSize matrix. This is where I am having trouble, because I want to replicate the behavior of MATLAB's fread() function, which fills the matrix in column order. For NumPy's fromfile(), the fill order isn't mentioned in the documentation (at least I couldn't find it).

It doesn't matter whether NumPy's fromfile() works like MATLAB's fread(), but I have to know how fromfile() works in order to code accordingly.

So the question is: what is the fill order of a 2-D array in NumPy's fromfile() when using a custom data type?

Answer 1

fromfile and tofile read and write flat, 1-D arrays:

In [204]: x = np.arange(1,11).astype('int32')                                                          
In [205]: x.tofile('data615')

fromfile returns a 1-D array:

In [206]: np.fromfile('data615',np.int32)                                                              
Out[206]: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10], dtype=int32)

x.reshape(2,5).tofile(...) would save the same thing. tofile does not save dtype or shape information.
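As a rough check of this claim, the sketch below writes the same values from a flat array, a C-ordered 2-D array, and a Fortran-ordered 2-D array, then compares the raw bytes. The helper `dump` and the file names are made up for illustration; the only library behavior relied on is that `tofile` writes the elements in C order regardless of the array's memory layout.

```python
import os
import tempfile

import numpy as np

x = np.arange(1, 11, dtype=np.int32)
c_2d = x.reshape(2, 5)           # row-major (C) layout in memory
f_2d = np.asfortranarray(c_2d)   # same values, column-major (F) layout in memory


def dump(arr, path):
    """Hypothetical helper: write arr with tofile and return the file's raw bytes."""
    arr.tofile(path)
    with open(path, "rb") as fh:
        return fh.read()


with tempfile.TemporaryDirectory() as tmp:
    flat_bytes = dump(x, os.path.join(tmp, "flat.bin"))
    c_bytes = dump(c_2d, os.path.join(tmp, "c_2d.bin"))
    f_bytes = dump(f_2d, os.path.join(tmp, "f_2d.bin"))

print(len(flat_bytes))                   # 10 values * 4 bytes each
print(flat_bytes == c_bytes == f_bytes)  # no shape or order information is stored
```

Since the three files are byte-for-byte identical, the reader has to supply both the shape and the intended order when loading.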

Reshaped to 2-D, the default order is 'C':

In [207]: np.fromfile('data615',np.int32).reshape(2,5)                                                 
Out[207]: 
array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]], dtype=int32)

but it can be changed to MATLAB-like (column-major) order:

In [208]: np.fromfile('data615',np.int32).reshape(2,5, order='F')                                      
Out[208]: 
array([[ 1,  3,  5,  7,  9],
       [ 2,  4,  6,  8, 10]], dtype=int32)

The underlying data buffer is the same, just a 1-D array of bytes.
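One way to see that only the interpretation changes is to check that both reshapes are views onto the flat array and differ only in their strides. This is a minimal sketch; the array `flat` stands in for the result of np.fromfile, and the variable names are illustrative.

```python
import numpy as np

flat = np.arange(1, 11, dtype=np.int32)    # stands in for np.fromfile(...)

c_view = flat.reshape(2, 5)                # row-major interpretation
f_view = flat.reshape(2, 5, order='F')     # column-major (MATLAB-like) interpretation

# Both reshapes reuse the memory of the flat array; nothing is copied.
shares_c = np.shares_memory(flat, c_view)
shares_f = np.shares_memory(flat, f_view)

# Strides are in bytes: how far to step for the next row / next column.
print(c_view.strides)   # (20, 4): next row is 5 int32s away
print(f_view.strides)   # (4, 8): next row is 1 int32 away
```

Writing to `f_view` would therefore also change `flat` and `c_view`, which is worth keeping in mind before modifying the data in place.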

Edit

The file could also be read as a structure of two integers:

In [249]: np.fromfile('data615','i4,i4')                                                               
Out[249]: 
array([(1,  2), (3,  4), (5,  6), (7,  8), (9, 10)],
      dtype=[('f0', '<i4'), ('f1', '<i4')])
In [250]: _['f0']                                                                                      
Out[250]: array([1, 3, 5, 7, 9], dtype=int32)

It's still a 1-D array, but with the numbers grouped in pairs.

Converting to complex:

In [252]: xx = np.fromfile('data615','i4,i4')                                                          
In [253]: xx['f0']+1j*xx['f1']                                                                         
Out[253]: array([1. +2.j, 3. +4.j, 5. +6.j, 7. +8.j, 9.+10.j])
In [254]: _.dtype                                                                                      
Out[254]: dtype('complex128')

If the data had been saved as floats, we could load them as complex directly:

In [255]: x.astype(np.float32).tofile('data615f')                                                      
In [257]: xx = np.fromfile('data615f',np.complex64)                                                    
In [258]: xx                                                                                           
Out[258]: array([1. +2.j, 3. +4.j, 5. +6.j, 7. +8.j, 9.+10.j], dtype=complex64)

Another way to get complex values from the integer sequence:

In [261]: np.fromfile('data615', np.int32).reshape(5,2)                                                
Out[261]: 
array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]], dtype=int32)
In [262]: xx = np.fromfile('data615', np.int32).reshape(5,2)                                           
In [263]: xx[:,0]+1j*xx[:,1]                                                                           
Out[263]: array([1. +2.j, 3. +4.j, 5. +6.j, 7. +8.j, 9.+10.j])
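A further variant, offered only as a sketch, combines the two ideas above: convert the integers to float32 and then reinterpret consecutive pairs as complex64 with `view`. It assumes the values are small enough to be represented exactly in float32, and `ints` stands in for the array returned by np.fromfile.

```python
import numpy as np

ints = np.arange(1, 11, dtype=np.int32)   # stands in for np.fromfile('data615', np.int32)

# astype makes a contiguous float32 copy; view then reinterprets each
# (real, imag) pair of float32 values as one complex64 element.
z = ints.astype(np.float32).view(np.complex64)

print(z.shape)   # half as many elements as the integer array
print(z.dtype)
```

The result is complex64 rather than the complex128 produced by the `xx['f0'] + 1j*xx['f1']` expression, so this is only a fit when single precision is acceptable.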

Answer 2

By default, when creating a new 2-D array, NumPy uses "C" ordering, which is row-major. That is the opposite of the order used by MATLAB, which is column-major.

For example, if BlockSize is 4, and the raw data is

0 1 2 3 4 5 6 7

then the 2 x 4 array will be

[[0, 1, 2, 3],
 [4, 5, 6, 7]]

With MATLAB and that same raw data, the 2 x 4 array would be

[[0, 2, 4, 6],
 [1, 3, 5, 7]]
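The example above can be reproduced with a structured dtype similar to the one in the question. This is a minimal sketch under stated assumptions: the `Header` field, the value 99, `block_size = 4`, and the file name are all invented for illustration, since the full structure is not shown in the question.

```python
import os
import tempfile

import numpy as np

block_size = 4  # assumed value, matching the example above

# Hypothetical record layout: one int32 header followed by the RawData block.
record = np.dtype([
    ('Header', np.int32),
    ('RawData', np.int32, (2, block_size)),
])

# Raw file contents: header 99, then the values 0..7.
raw = np.concatenate(([99], np.arange(8))).astype(np.int32)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.bin")
    raw.tofile(path)
    rec = np.fromfile(path, dtype=record)

c_filled = rec['RawData'][0]   # sub-array field is filled row by row (C order)

# To get the MATLAB-style fill, flatten back to file order first, then
# reshape column by column. Reshaping the 2-D array directly with
# order='F' would return it unchanged.
matlab_like = c_filled.ravel().reshape(2, block_size, order='F')

print(c_filled)
print(matlab_like)
```

An alternative with the same result is to declare the field as `(block_size, 2)` and transpose it after reading, since a C-ordered `(block_size, 2)` array transposed has the same element placement as a column-major `(2, block_size)` array.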

Answer 1: score 2, accepted, 2020-05-01 16:47:09

Answer 2: score 1, 2020-05-01 15:43:39

二维ndarray的Numpy fromfile的填充顺序是什么？

问题描述

2 个解决方案

解决方案1 2 已采纳 2020-05-01 16:47:09

edit编辑

Converting to complex:转换为复杂：

解决方案2 1 2020-05-01 15:43:39

解决方案1
2 已采纳 2020-05-01 16:47:09

解决方案2
1 2020-05-01 15:43:39