Reading scattered binary data with numpy.fromfile

Question

A binary file contains several blocks that I want to read with a single call to numpy.fromfile. Each block has the following format:

OES=[
('EKEY','i4',1), 
('FD1','f4',1),
('EX1','f4',1),
('EY1','f4',1),
('EXY1','f4',1),
('EA1','f4',1),
('EMJRP1','f4',1),
('EMNRP1','f4',1),
('EMAX1','f4',1),
('FD2','f4',1),
('EX2','f4',1),
('EY2','f4',1),
('EXY2','f4',1),
('EA2','f4',1),
('EMJRP2','f4',1),
('EMNRP2','f4',1),
('EMAX2','f4',1)]
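
As a quick check on the record layout, the sketch below builds an equivalent dtype and prints its size in bytes. It assumes every field holds a single scalar, so the trailing shape of 1 from the list above is left out; the names `oes_dtype` and `layer_fields` are made up for this example.

```python
import numpy as np

# Field name stems repeated for layer 1 and layer 2, in the order used by OES.
layer_fields = ('FD', 'EX', 'EY', 'EXY', 'EA', 'EMJRP', 'EMNRP', 'EMAX')

# One int32 key followed by 16 float32 values (assumption: scalar fields,
# so the explicit shape of 1 is omitted).
oes_dtype = np.dtype(
    [('EKEY', 'i4')]
    + [(f'{stem}{layer}', 'f4') for layer in (1, 2) for stem in layer_fields]
)

# Each record occupies 17 fields x 4 bytes.
print(oes_dtype.names)
print(oes_dtype.itemsize)
```

Knowing the record size makes it possible to convert a block's byte length into a record count, and the other way round.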

Here is the layout of the binary file:

 Data I want (OES format repeating n times)
 ------------------------
 Useless Data
 ------------------------
 Data I want (OES format repeating m times)
 ------------------------
 etc..

I know the byte offsets separating the data I want from the useless data. I also know the size of each data block I want.

So far, I have accomplished this by seeking on the file object f and then calling:

nparr = np.fromfile(f,dtype=OES,count=size)

This gives me a separate nparr for each data block I want, and I then concatenate all of the numpy arrays into one new array.
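
For reference, the seek-and-concatenate approach described here might look roughly like the sketch below. It is not the asker's actual code: the two-field record `rec`, the synthetic file and the `blocks` list are all stand-ins invented to keep the example short and runnable.

```python
import os
import tempfile
import numpy as np

# Two-field stand-in for the OES record (hypothetical, to keep the sketch short).
rec = np.dtype([('EKEY', 'i4'), ('FD1', 'f4')])

# Build a synthetic file: 3 wanted records, 10 junk bytes, 2 wanted records.
first = np.zeros(3, dtype=rec)
first['EKEY'] = [1, 2, 3]
second = np.zeros(2, dtype=rec)
second['EKEY'] = [4, 5]
path = os.path.join(tempfile.mkdtemp(), 'blocks.bin')
with open(path, 'wb') as out:
    out.write(first.tobytes())
    out.write(b'\xff' * 10)
    out.write(second.tobytes())

# (byte offset, record count) for each wanted block.
blocks = [(0, 3), (3 * rec.itemsize + 10, 2)]

parts = []
with open(path, 'rb') as f:
    for offset, count in blocks:
        f.seek(offset)
        parts.append(np.fromfile(f, dtype=rec, count=count))

# concatenate allocates a new array, so the data briefly exists twice in memory.
combined = np.concatenate(parts)
print(combined['EKEY'])
```

The extra copy made by np.concatenate is the memory cost the question is trying to avoid.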

My goal is to have a single array holding all the blocks I want without concatenating (to save memory). That is, I want to call nparr = np.fromfile(f,dtype=OES) only once. Is there a way to accomplish this?

Answer 1

That is, I want to call nparr = np.fromfile(f,dtype=OES) only once. Is there a way to accomplish this goal?

No, not with a single call to fromfile().

But if you know the complete layout of the file in advance, you can preallocate the array, and then use fromfile and seek to read the OES blocks directly into the preallocated array. Suppose, for example, that you know the file position of each OES block and the number of records in each block. That is, you know:

file_positions = [position1, position2, ...]
numrecords = [n1, n2, ...]
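
If only the block sizes and the gaps between them are known, as the question suggests, the start positions can be derived from them. The sketch below shows one way to do that; the record counts, gap sizes and the `header_bytes` value are hypothetical, and the record size of 68 bytes assumes 17 four-byte fields per OES record.

```python
record_size = 68            # bytes per OES record: 17 fields x 4 bytes
numrecords = [5, 3, 7]      # hypothetical record counts per wanted block
gap_bytes = [120, 40]       # hypothetical junk sizes between consecutive blocks
header_bytes = 0            # assumption: the first block starts at the file start

file_positions = []
pos = header_bytes
for i, n in enumerate(numrecords):
    file_positions.append(pos)
    pos += n * record_size          # skip over the wanted records
    if i < len(gap_bytes):
        pos += gap_bytes[i]         # skip over the junk that follows

print(file_positions)
```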

Then you could do something like this (assuming f is the already opened file):

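# Preallocate one array large enough for every wanted record.
total = sum(numrecords)
nparr = np.empty(total, dtype=OES)
current_index = 0
for pos, n in zip(file_positions, numrecords):
    # Jump to the start of the block and read its n records into place.
    f.seek(pos)
    nparr[current_index:current_index+n] = np.fromfile(f, count=n, dtype=OES)
    current_index += n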
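
The loop above still creates a temporary array for each block before copying it into place. A possible variation, sketched below, swaps np.fromfile for the file object's readinto method so the bytes land directly in the preallocated array. This is not part of the original answer: the helper `read_blocks` and the synthetic test file are assumptions made for illustration, and the dtype is written without the trailing shape of 1.

```python
import os
import tempfile
import numpy as np

# OES layout from the question, written without the trailing shape of 1.
stems = ('FD', 'EX', 'EY', 'EXY', 'EA', 'EMJRP', 'EMNRP', 'EMAX')
OES = np.dtype([('EKEY', 'i4')] +
               [(f'{stem}{layer}', 'f4') for layer in (1, 2) for stem in stems])

def read_blocks(f, file_positions, numrecords, dtype):
    """Read scattered blocks of records into one preallocated array."""
    out = np.empty(sum(numrecords), dtype=dtype)
    start = 0
    for pos, n in zip(file_positions, numrecords):
        f.seek(pos)
        # View the destination slice as raw bytes so the file fills it in place.
        dest = out[start:start + n].view(np.uint8)
        got = f.readinto(dest)
        if got != dest.nbytes:
            raise EOFError(f'block at {pos}: expected {dest.nbytes} bytes, got {got}')
        start += n
    return out

# Synthetic file: 4 records, 100 junk bytes, 2 records.
a = np.zeros(4, dtype=OES)
a['EKEY'] = np.arange(4)
b = np.zeros(2, dtype=OES)
b['EKEY'] = [10, 11]
gap = b'\x00' * 100
path = os.path.join(tempfile.mkdtemp(), 'oes.bin')
with open(path, 'wb') as fh:
    fh.write(a.tobytes())
    fh.write(gap)
    fh.write(b.tobytes())

with open(path, 'rb') as f:
    nparr = read_blocks(f, [0, a.nbytes + len(gap)], [4, 2], OES)

print(nparr['EKEY'])
```

Either way, the result is a single array whose peak memory use is close to the size of the wanted data alone.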

Solution 1
2 (accepted) 2016-08-06 16:41:17
