
Equivalent Of Reading In Raw Byte Files In MATLAB for Python

I have raw byte files made up of 28x28-byte blocks, each of which represents an image. Each file contains 1000 of these blocks, which are the data I want to analyze. For example, the file 'data' holds 1000 blocks of 28x28 bytes, representing 1000 images of 28x28 pixels.

Right now, I know how to read this in using MATLAB:

fid = fopen('data', 'r');              % open the file
[t1, N] = fread(fid, [28 28], 'uchar'); % read the first example into a 28x28 matrix t1
[t2, N] = fread(fid, [28 28], 'uchar'); % read the second example into t2, and so on
% To display the image, use imshow(t1) or imagesc(t1)

I was wondering how I could do the same with Python. I am having trouble getting it to work.

I got it working by doing this:

import numpy as np

# `digit` is assumed to be defined earlier; it selects which data file to open
with open('data/data' + str(digit), 'rb') as fid:
    dim = np.fromfile(fid, dtype=np.uint8)  # read in the entire dataset

# loop through the thousand examples and slice out each 28x28 image
for i in range(1000):  # the original used xrange (Python 2)
    # start and end offsets of image i in the flat array
    index = i * 28 * 28
    nextindex = (i + 1) * 28 * 28

    # reshape the 784-byte slice into a 28x28 image
    newdim = dim[index:nextindex].reshape(28, 28)

Instead of trying to read the file block by block, I read in the entire file and then index through it 28*28 bytes at a time. Please let me know if I am making a mistake anywhere here.
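The read-everything-then-index approach can also be written without an explicit loop. The following is a minimal sketch, not the only way to do it: the function name `load_images` and its `side` and `matlab_order` parameters are hypothetical names introduced here for illustration. It also accounts for one difference worth checking: MATLAB's `fread(fid, [28 28], ...)` fills the matrix column by column, while NumPy's default `reshape` fills row by row, so each NumPy image is the transpose of the corresponding MATLAB matrix unless it is transposed back.

```python
import numpy as np


def load_images(path, side=28, matlab_order=False):
    """Read a raw uint8 file and return an array of shape (n_images, side, side).

    `load_images`, `side` and `matlab_order` are illustrative names,
    not part of NumPy or of the original post.
    """
    raw = np.fromfile(path, dtype=np.uint8)  # whole file as a flat byte array

    block = side * side
    if raw.size % block != 0:
        raise ValueError("file size is not a multiple of one image block")

    # -1 lets NumPy infer the number of images from the file size
    images = raw.reshape(-1, side, side)

    if matlab_order:
        # swap rows and columns of every image so that images[k] matches
        # what MATLAB's column-major fread(fid, [side side]) would return
        images = images.transpose(0, 2, 1)

    return images
```

With a file of 1000 blocks, `load_images('data/data0')` (the path is only an example) should return an array of shape `(1000, 28, 28)`, and `images[i]` corresponds to `newdim` at iteration `i` of the loop above. Whether `matlab_order=True` is needed depends on how the files were written, so it is worth displaying one image both ways to see which orientation looks right.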
