
How to read binary file data into arrays?

I am attempting to read a binary file in Python. From the dataset page:

The pixels are stored as unsigned chars (1 byte) and take values from 0 to 255

I have tried the following, which prints (0,) rather than a 784,000-element array.

# -*- coding: utf8 -*-
# Processed MNIST dataset (http://cis.jhu.edu/~sachin/digit/digit.html)
import struct

f = open('data/data0', mode='rb')
data = []

print struct.unpack('<i', f.read(4))

How can I read this binary file into either a flat 784,000-element array (28 x 28 pixels x 1,000 samples) or a 28x28x1000 3D array? I have never worked with binary files before and am quite confused!

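struct.unpack('<i', f.read(4)) decodes only the first four bytes of the file as a single little-endian integer, which is why you see a one-element tuple. f.read() will instead give you all 784,000 bytes as an immutable sequence (called a str in Python 2). If you need it to be mutable, you can use the array module, whose array type can store various primitive types, including unsigned bytes (type code 'B'):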

from array import array

data = array('B')

with open('data/data0', 'rb') as f:
    data.fromfile(f, 784000)

This can be sliced as necessary:

EXAMPLE_SIZE = 28 * 28  # each image is 28 x 28 one-byte pixels, as stated in the question
examples = [data[s:s + EXAMPLE_SIZE] for s in xrange(0, len(data), EXAMPLE_SIZE)]
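
If you want the nested 3D layout the question asks about, the same approach can be extended as in the rough sketch below. It uses only the standard library and should run on both Python 2 and 3. Several things here are assumptions rather than facts from the dataset page: the helper name load_images is made up, the pixels are assumed to be stored row by row within each image, and the result is indexed as images[sample][row][col] (that is, 1000 x 28 x 28 rather than 28 x 28 x 1000). To keep the example self-contained it reads a small synthetic stand-in file instead of data/data0.

```python
import os
import tempfile
from array import array

ROWS, COLS = 28, 28        # image dimensions given in the question
IMAGE_SIZE = ROWS * COLS   # 784 one-byte pixels per image


def load_images(path, count):
    """Hypothetical helper: read `count` images of unsigned bytes from `path`.

    Assumes row-major pixel order inside each image (not confirmed by the
    dataset page). Returns nested lists indexed as [sample][row][col].
    """
    data = array('B')
    with open(path, 'rb') as f:
        # fromfile raises EOFError if the file holds fewer items than requested
        data.fromfile(f, count * IMAGE_SIZE)
    images = []
    for i in range(count):
        start = i * IMAGE_SIZE
        rows = [data[start + r * COLS:start + (r + 1) * COLS].tolist()
                for r in range(ROWS)]
        images.append(rows)
    return images


# Synthetic stand-in for data/data0: 3 images whose pixel value is the
# byte offset modulo 256, so the indexing is easy to check by hand.
demo_count = 3
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(bytearray(i % 256 for i in range(demo_count * IMAGE_SIZE)))

images = load_images(demo_path, demo_count)
os.remove(demo_path)
```

For the real file you would call something like load_images('data/data0', 1000). Nested lists are convenient for inspection but use far more memory than the flat array, so keeping the flat array and computing offsets may be preferable for larger files.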

