
Ensuring Python equivalence of MATLAB's `fread`

I have a binary test file found at http://jmp.sh/VpTZxgQ and I am trying to rewrite some MATLAB code in Python which reads this file.

What I have realised is that MATLAB's fread remembers how much has already been read, so each call continues from where the previous one stopped. How do I ensure I get the same behaviour in Python?

MATLAB code:

clear all; close all;

path = pwd;
ext = 'bin';
stem = 'test';
filename = [stem,'.',ext];
filename = fullfile(path,filename);
fid = fopen(filename,'r');

fread(fid,2,'int16')
fread(fid,32,'char')
fread(fid,2,'int16')

Python code:

import numpy as np  

def fread(filename, n, precision):
     with open(filename, 'rb') as fid:
         data_array = np.fromfile(fid, precision).reshape((-1, 1)).T

     return data_array[0,0:n]

print fread('test.bin', 2, np.int16)                                                                                                                         
print fread('test.bin', 32, np.str)
print fread('test.bin', 2, np.int16) 

Ideally the output of the two versions would be the same, but it is not. In fact, Python raises a ValueError when I set precision to np.str.

As a bonus question: I'm assuming that reading a binary file and making sense of its contents requires the user to know how the data was formatted. Is this true?

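As the comments suggest, you need to open the file once and pass the same open file object to every read, which is what the MATLAB code does with its file identifier fid. The file object keeps track of the current position, so each read continues where the previous one ended: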

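import numpy as np

def fread(fid, nelements, dtype):
    # np.str was only an alias for the built-in str and has been removed
    # from recent NumPy releases, so compare against str directly.
    if dtype is str:
        dt = np.uint8  # WARNING: assuming 8-bit ASCII for str!
    else:
        dt = dtype

    # np.fromfile reads from the current position of fid and advances it.
    data_array = np.fromfile(fid, dt, nelements)
    data_array.shape = (nelements, 1)

    return data_array

fid = open('test.bin', 'rb')

print(fread(fid, 2, np.int16))
print(fread(fid, 32, str))
print(fread(fid, 2, np.int16))

fid.close()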
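To see the position tracking without needing test.bin, here is a minimal sketch that writes a small stand-in file with the same layout (2 int16 values, 32 bytes of text, 2 int16 values) and reads it back through a single file object. The file contents and the names `header`, `text`, `trailer` and `positions` are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in data with the same layout as the reads above.
header = np.array([32, 0], dtype=np.int16)
text = b"hello".ljust(32)  # 32 bytes, padded with spaces
trailer = np.array([7, 8], dtype=np.int16)

fd, path = tempfile.mkstemp(suffix=".bin")
with os.fdopen(fd, "wb") as out:
    out.write(header.tobytes() + text + trailer.tobytes())

positions = []
with open(path, "rb") as fid:
    first = np.fromfile(fid, np.int16, 2)
    positions.append(fid.tell())  # byte offset after the first read
    chars = np.fromfile(fid, np.uint8, 32)
    positions.append(fid.tell())
    last = np.fromfile(fid, np.int16, 2)
    positions.append(fid.tell())

os.remove(path)
print(first, last, positions)
```

Each call picks up where the previous one stopped, so the third read returns the trailer values rather than the header again.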

Reading and writing binary data requires the reader and the writer to agree on a format, so yes, you need to know how the file is laid out to make sense of it. As the commenters suggest, endianness may become an issue if you save the binary file on one computer and try to read it on another. If the data is always written and read on machines with the same byte order, you won't run into the issue.
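If byte order is a concern, one option is to state it explicitly in the NumPy dtype instead of relying on the machine's native order. The sketch below assumes the int16 values were written little-endian; the byte string `raw` is constructed here for illustration and is not taken from test.bin.

```python
import numpy as np

# Two int16 values (32 and 0) encoded as little-endian bytes.
raw = (32).to_bytes(2, "little") + (0).to_bytes(2, "little")

# '<i2' means little-endian 16-bit signed int, '>i2' means big-endian.
little = np.frombuffer(raw, dtype="<i2")
big = np.frombuffer(raw, dtype=">i2")

print(little)  # interpreted with the byte order the data was written in
print(big)     # same bytes, wrong byte order
```

The same dtype strings can be passed to `np.fromfile`, so the reader no longer depends on which CPU it runs on.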

Output for test.bin:

MATLAB Output             Python+Numpy Output
------------------------------------------------------
ans =                     

    32                    [[32]
     0                     [ 0]]

ans =                   

    35                    [[ 35]
    32                     [ 32]
    97                     [ 97]
   102                     [102]
    48                     [ 48]
    52                     [ 52]
    50                     [ 50]
    95                     [ 95]
    53                     [ 53]
    48                     [ 48]
   112                     [112]
   101                     [101]
   114                     [114]
    99                     [ 99]
    95                     [ 95]
   115                     [115]
   112                     [112]
    97                     [ 97]
   110                     [110]
    32                     [ 32]
    32                     [ 32]
    32                     [ 32]
    32                     [ 32]
    32                     [ 32]
    32                     [ 32]
    32                     [ 32]
    32                     [ 32]
    32                     [ 32]
    32                     [ 32]
    32                     [ 32]
    32                     [ 32]
    32                     [ 32]]

ans =

    32                     [[32]
     0                      [ 0]]
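The 32 values in the middle block are character codes. Assuming they are plain ASCII, as the answer's code does, they can be turned back into text like this (the list `codes` is copied from the output above):

```python
# Character codes copied from the second read: 19 characters, then 13 spaces.
codes = [35, 32, 97, 102, 48, 52, 50, 95, 53, 48, 112, 101, 114, 99, 95,
         115, 112, 97, 110] + [32] * 13

text = bytes(codes).decode("ascii")
print(repr(text.rstrip()))  # strip the trailing space padding
```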
