Extract images from a .idx3-ubyte file or GZIP via Python

Question

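I have created a simple face recognition function using the facerecognizer from OpenCV. It works fine with images of people.

Now I would like to run a test using handwritten characters instead of people. I came across the MNIST dataset, but it stores the images in a strange file format that I have never seen before.

I simply need to extract a few images from: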

train-images.idx3-ubyte

and save them in a folder as .gif.

Or am I misunderstanding this MNIST thing? If so, where can I get such a dataset?

EDIT

I also have the gzip file:

train-images-idx3-ubyte.gz

I am trying to read the content, but show() does not work, and if I call read() I see random symbols.

images = gzip.open("train-images-idx3-ubyte.gz", 'rb')
print(images.read())

EDIT

I managed to get some useful output by using:

with gzip.open('train-images-idx3-ubyte.gz','r') as fin:
    for line in fin:
        print('got line', line)

Somehow I now have to convert this output into an image.

Answer 1

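Download the training/test images and labels:

train-images-idx3-ubyte.gz: training set images
train-labels-idx1-ubyte.gz: training set labels
t10k-images-idx3-ubyte.gz: test set images
t10k-labels-idx1-ubyte.gz: test set labels

Uncompress them into a working directory, e.g. samples/.

Get the python-mnist package from PyPI: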

pip install python-mnist

Import the mnist package and read the training/test images:

from mnist import MNIST

mndata = MNIST('samples')

images, labels = mndata.load_training()
# or
images, labels = mndata.load_testing()

Display an image on the console:

import random

index = random.randrange(0, len(images))  # choose an index ;-)
print(mndata.display(images[index]))

You will get something like this:

............................
............................
............................
............................
............................
.................@@.........
..............@@@@@.........
............@@@@............
..........@@................
..........@.................
...........@................
...........@................
...........@...@............
...........@@@@@.@..........
...........@@@...@@.........
...........@@.....@.........
..................@.........
..................@@........
..................@@........
..................@.........
.................@@.........
...........@.....@..........
...........@....@@..........
............@@@@............
.............@..............
............................
............................
............................

Notes:

Each image in the images list is a Python list of unsigned bytes.
The labels are a Python array of unsigned bytes.
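
Since each image is a flat list of 784 values, turning it into a 28x28 picture is just a matter of slicing the list row by row. The sketch below is not python-mnist's own code; render_ascii and its threshold of 200 are assumptions picked for illustration.

```python
def render_ascii(pixels, width=28, threshold=200):
    """Render a flat list of 0-255 pixel values as rows of '.' and '@'."""
    rows = []
    for start in range(0, len(pixels), width):
        row = pixels[start:start + width]
        rows.append("".join("@" if value > threshold else "." for value in row))
    return "\n".join(rows)


# Tiny 4x2 stand-in for a 28x28 MNIST image (hypothetical values).
demo = [0, 255, 255, 0,
        0, 0, 250, 0]
print(render_ascii(demo, width=4))
```

With a real image from load_training() you would call render_ascii(images[index]) and keep the default width of 28.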

Answer 2

(Using only matplotlib, gzip and numpy)
Extract the image data:

import gzip
import numpy as np

f = gzip.open('train-images-idx3-ubyte.gz', 'rb')

image_size = 28
num_images = 5

f.read(16)  # skip the 16-byte header (magic number, image count, rows, columns)
buf = f.read(image_size * image_size * num_images)
data = np.frombuffer(buf, dtype=np.uint8).astype(np.float32)
data = data.reshape(num_images, image_size, image_size, 1)

Display an image:

import matplotlib.pyplot as plt
image = np.asarray(data[2]).squeeze()
plt.imshow(image)
plt.show()

Print the first 50 labels:

f = gzip.open('train-labels-idx1-ubyte.gz', 'rb')
f.read(8)  # skip the 8-byte header (magic number, label count)
for i in range(0, 50):
    buf = f.read(1)
    labels = np.frombuffer(buf, dtype=np.uint8).astype(np.int64)
    print(labels)
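
The f.read(16) and f.read(8) calls skip the IDX header, which is 4 bytes of magic number followed by one 4-byte big-endian size per dimension. If you would rather read the sizes than hard-code them, something like the following should work. read_idx_header is a hypothetical helper, and the data here is a small synthetic stand-in rather than the real MNIST file.

```python
import gzip
import io
import struct


def read_idx_header(f):
    """Read an IDX header from a binary file object and return (magic, dims)."""
    # Bytes 0-1 are zero, byte 2 is the data type code,
    # byte 3 is the number of dimensions.
    magic = struct.unpack(">I", f.read(4))[0]
    ndim = magic & 0xFF
    # Each dimension size is a 4-byte big-endian unsigned integer.
    dims = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
    return magic, dims


# Synthetic stand-in for train-images-idx3-ubyte.gz: 2 images of 3x3 pixels.
raw = struct.pack(">IIII", 2051, 2, 3, 3) + bytes(18)
with gzip.open(io.BytesIO(gzip.compress(raw)), "rb") as f:
    magic, dims = read_idx_header(f)
    print(magic, dims)
```

For the MNIST image files the magic number is 2051 (three dimensions), for the label files it is 2049 (one dimension), which is why the header lengths are 16 and 8 bytes.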

Answer 3

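You can actually use the idx2numpy package available on PyPI. It is extremely simple to use and converts the data directly into numpy arrays. Here is what you have to do:

Download the data

Download the MNIST dataset from the official website.
If you are using Linux, you can use wget to fetch it right from the command line. Just run: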

wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz

Decompress the data

Unzip or decompress the data. On Linux, you can use gzip.

Ultimately, you should have the following files:

data/train-images-idx3-ubyte
data/train-labels-idx1-ubyte
data/t10k-images-idx3-ubyte
data/t10k-labels-idx1-ubyte

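The prefix data/ is only there because I extracted them into a folder named data. Judging by your question, you are fine up to this point, so keep reading.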
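
If you are not on Linux or prefer to stay in Python, the decompression step can also be done with the standard library. This is only a sketch: gunzip is a hypothetical helper name, and the demonstration uses a throwaway file instead of the real archives.

```python
import gzip
import os
import shutil
import tempfile


def gunzip(src_path, dest_path):
    """Decompress src_path (a .gz file) into dest_path, streaming in chunks."""
    with gzip.open(src_path, "rb") as src, open(dest_path, "wb") as dest:
        shutil.copyfileobj(src, dest)


# Demonstration with a throwaway file instead of the real MNIST archives.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "example-idx3-ubyte.gz")
with gzip.open(archive, "wb") as f:
    f.write(b"\x00\x00\x08\x03")

target = archive[:-len(".gz")]  # same name without the .gz suffix
gunzip(archive, target)
with open(target, "rb") as f:
    print(f.read())
```

For the real data you would call it once per archive, e.g. gunzip('data/train-images-idx3-ubyte.gz', 'data/train-images-idx3-ubyte').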

Using idx2numpy

Here is a simple Python snippet that reads everything from the decompressed files as numpy arrays.

import idx2numpy
import numpy as np
file = 'data/train-images-idx3-ubyte'
arr = idx2numpy.convert_from_file(file)
# arr is now a np.ndarray type of object of shape 60000, 28, 28

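You can now use it with OpenCV in the same way you display any other image, using something like

import cv2 as cv

cv.imshow("Image", arr[4])
cv.waitKey(0)  # keep the window open until a key is pressed

To install idx2numpy, you can use PyPI (the pip package manager). Just run: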

pip install idx2numpy

Answer 4

import gzip
import numpy as np


def training_images():
    with gzip.open('data/train-images-idx3-ubyte.gz', 'r') as f:
        # first 4 bytes is a magic number
        magic_number = int.from_bytes(f.read(4), 'big')
        # second 4 bytes is the number of images
        image_count = int.from_bytes(f.read(4), 'big')
        # third 4 bytes is the row count
        row_count = int.from_bytes(f.read(4), 'big')
        # fourth 4 bytes is the column count
        column_count = int.from_bytes(f.read(4), 'big')
        # rest is the image pixel data, each pixel is stored as an unsigned byte
        # pixel values are 0 to 255
        image_data = f.read()
        images = np.frombuffer(image_data, dtype=np.uint8)\
            .reshape((image_count, row_count, column_count))
        return images


def training_labels():
    with gzip.open('data/train-labels-idx1-ubyte.gz', 'r') as f:
        # first 4 bytes is a magic number
        magic_number = int.from_bytes(f.read(4), 'big')
        # second 4 bytes is the number of labels
        label_count = int.from_bytes(f.read(4), 'big')
        # rest is the label data, each label is stored as unsigned byte
        # label values are 0 to 9
        label_data = f.read()
        labels = np.frombuffer(label_data, dtype=np.uint8)
        return labels
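
To get from these arrays to .gif files in a folder, which is what the question asks for, one option is Pillow. That library is my addition and is not used anywhere else in this answer, and save_as_gifs is a hypothetical helper. The demonstration runs on random stand-in data so it works without the MNIST files.

```python
import os
import tempfile

import numpy as np
from PIL import Image


def save_as_gifs(images, labels, out_dir, limit=10):
    """Write the first `limit` images to out_dir as <index>_<label>.gif."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index in range(min(limit, len(images))):
        # Image.fromarray turns a 2-D uint8 array into an 8-bit grayscale image.
        picture = Image.fromarray(images[index])
        path = os.path.join(out_dir, f"{index}_{labels[index]}.gif")
        picture.save(path)
        paths.append(path)
    return paths


# Synthetic stand-in data; with the functions above you would pass
# training_images() and training_labels() instead.
fake_images = np.random.randint(0, 256, size=(3, 28, 28), dtype=np.uint8)
fake_labels = np.array([5, 0, 4], dtype=np.uint8)
written = save_as_gifs(fake_images, fake_labels, tempfile.mkdtemp())
print(written)
```

Putting the label into the file name is just one way to keep images and labels together; a separate CSV file would work as well.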

Answer 5

Install idx2numpy

pip install idx2numpy

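Download the data

Download the MNIST dataset from the official website.

Decompress the data

Ultimately, you should have the following files: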

train-images-idx3-ubyte
train-labels-idx1-ubyte
t10k-images-idx3-ubyte
t10k-labels-idx1-ubyte

Using idx2numpy

import numpy as np
import idx2numpy
import matplotlib.pyplot as plt

# file name as listed above, after decompression
imagefile = 'train-images-idx3-ubyte'
imagearray = idx2numpy.convert_from_file(imagefile)

plt.imshow(imagearray[4], cmap=plt.cm.binary)
plt.show()

Answer 6

Use this to extract the MNIST database into images and CSV labels in Python:

https://github.com/sorki/python-mnist

Answer 7

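Here is a ready-made function for you! (It loads the images in binary form, i.e. every pixel is 0 or 1.)

import gzip
import os

import numpy as np
import requests


def load_mnist(train_data=True, test_data=False):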
    """
    Get mnist data from the official website and
    load them in binary format.

    Parameters
    ----------
    train_data : bool
        Loads
        'train-images-idx3-ubyte.gz'
        'train-labels-idx1-ubyte.gz'
    test_data : bool
        Loads
        't10k-images-idx3-ubyte.gz'
        't10k-labels-idx1-ubyte.gz' 

    Return
    ------
    tuple
    tuple[0] are images (train & test)
    tuple[1] are labels (train & test)

    """
    RESOURCES = [
        'train-images-idx3-ubyte.gz',
        'train-labels-idx1-ubyte.gz',
        't10k-images-idx3-ubyte.gz',
        't10k-labels-idx1-ubyte.gz']

    # create data/mnist (and data) if they do not exist yet
    os.makedirs('data/mnist', exist_ok=True)
    for name in RESOURCES:
        path = os.path.join('data/mnist', name)
        if not os.path.isfile(path):
            url = 'http://yann.lecun.com/exdb/mnist/' + name
            r = requests.get(url, allow_redirects=True)
            with open(path, 'wb') as out:
                out.write(r.content)

    return get_images(train_data, test_data), get_labels(train_data, test_data)


def get_images(train_data=True, test_data=False):

    to_return = []

    if train_data:
        with gzip.open('data/mnist/train-images-idx3-ubyte.gz', 'r') as f:
            # first 4 bytes is a magic number
            magic_number = int.from_bytes(f.read(4), 'big')
            # second 4 bytes is the number of images
            image_count = int.from_bytes(f.read(4), 'big')
            # third 4 bytes is the row count
            row_count = int.from_bytes(f.read(4), 'big')
            # fourth 4 bytes is the column count
            column_count = int.from_bytes(f.read(4), 'big')
            # rest is the image pixel data, each pixel is stored as an unsigned byte
            # pixel values are 0 to 255
            image_data = f.read()
            train_images = np.frombuffer(image_data, dtype=np.uint8)\
                .reshape((image_count, row_count, column_count))
            to_return.append(np.where(train_images > 127, 1, 0))

    if test_data:
        with gzip.open('data/mnist/t10k-images-idx3-ubyte.gz', 'r') as f:
            # first 4 bytes is a magic number
            magic_number = int.from_bytes(f.read(4), 'big')
            # second 4 bytes is the number of images
            image_count = int.from_bytes(f.read(4), 'big')
            # third 4 bytes is the row count
            row_count = int.from_bytes(f.read(4), 'big')
            # fourth 4 bytes is the column count
            column_count = int.from_bytes(f.read(4), 'big')
            # rest is the image pixel data, each pixel is stored as an unsigned byte
            # pixel values are 0 to 255
            image_data = f.read()
            test_images = np.frombuffer(image_data, dtype=np.uint8)\
                .reshape((image_count, row_count, column_count))
            to_return.append(np.where(test_images > 127, 1, 0))

    return to_return


def get_labels(train_data=True, test_data=False):

    to_return = []

    if train_data:
        with gzip.open('data/mnist/train-labels-idx1-ubyte.gz', 'r') as f:
            # first 4 bytes is a magic number
            magic_number = int.from_bytes(f.read(4), 'big')
            # second 4 bytes is the number of labels
            label_count = int.from_bytes(f.read(4), 'big')
            # rest is the label data, each label is stored as unsigned byte
            # label values are 0 to 9
            label_data = f.read()
            train_labels = np.frombuffer(label_data, dtype=np.uint8)
            to_return.append(train_labels)
    if test_data:
        with gzip.open('data/mnist/t10k-labels-idx1-ubyte.gz', 'r') as f:
            # first 4 bytes is a magic number
            magic_number = int.from_bytes(f.read(4), 'big')
            # second 4 bytes is the number of labels
            label_count = int.from_bytes(f.read(4), 'big')
            # rest is the label data, each label is stored as unsigned byte
            # label values are 0 to 9
            label_data = f.read()
            test_labels = np.frombuffer(label_data, dtype=np.uint8)
            to_return.append(test_labels)

    return to_return
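
The "binary format" mentioned at the top of this answer comes from the np.where(... > 127, 1, 0) lines. A small sketch of what that thresholding does, on a made-up 2x3 patch rather than real MNIST data:

```python
import numpy as np

# Hypothetical 2x3 grayscale patch with values in the 0-255 range.
patch = np.array([[0, 127, 128],
                  [200, 30, 255]], dtype=np.uint8)

# Same rule as in get_images(): strictly greater than 127 becomes 1.
binary = np.where(patch > 127, 1, 0)

# Equivalent values, but kept in a compact 8-bit dtype.
binary_u8 = (patch > 127).astype(np.uint8)

print(binary)
print(binary_u8.dtype)
```

Note that np.where with plain Python integers returns a wider integer type than uint8, so the second form may be preferable if memory matters for the full 60000-image array.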

Answer 8

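I had the same problem.

Whenever I unzipped the archive, the extension of the extracted file was not removed, so I had:

train-images-idx3-ubyte.gz

Once I removed the .gz, I had:

train-images-idx3-ubyte

This fixed my problem.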
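
If you are unsure whether a file is still compressed, the extension is not a reliable guide, but the first two bytes are: gzip data always starts with 0x1f 0x8b. A sketch of a loader that checks this is below; open_maybe_gzip and the file names are hypothetical.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_maybe_gzip(path):
    """Open path for binary reading, decompressing if it holds gzip data."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == GZIP_MAGIC
    return gzip.open(path, "rb") if is_gzip else open(path, "rb")


# Demonstration: the same payload stored compressed and uncompressed,
# both under names without a .gz extension (hypothetical file names).
workdir = tempfile.mkdtemp()
payload = b"\x00\x00\x08\x01"
plain_path = os.path.join(workdir, "plain-idx1-ubyte")
packed_path = os.path.join(workdir, "packed-idx1-ubyte")
with open(plain_path, "wb") as f:
    f.write(payload)
with gzip.open(packed_path, "wb") as f:
    f.write(payload)

for path in (plain_path, packed_path):
    with open_maybe_gzip(path) as f:
        print(f.read())
```

An uncompressed IDX file starts with two zero bytes, so it cannot be mistaken for gzip data by this check.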


Answer 1: score 64, 2016-11-04 19:08:12
Answer 2: score 39, 2018-12-01 12:18:07
Answer 3: score 14, 2019-04-03 21:29:05
Answer 4: score 8, 2020-07-07 18:07:16
Answer 5: score 7, 2020-04-24 06:58:01
Answer 6: score 1, 2017-08-10 18:26:19
Answer 7: score 1, 2021-03-04 16:09:26
Answer 8: score -3, 2020-06-27 12:26:35
