Extract images from .idx3-ubyte file or GZIP via Python
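I have created a simple face recognition function using OpenCV's facerecognizer. It works fine with images of people.
Now I would like to test it with handwritten characters instead of people. I came across the MNIST dataset, but it stores its images in a strange file format that I have never seen before.
I simply need to extract a few images from: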
train-images.idx3-ubyte
and save them in a folder as .gif files.
Or am I misunderstanding this MNIST thing? If so, where can I get such a dataset?
EDIT
I also have the gzip file:
train-images-idx3-ubyte.gz
I am trying to read the contents, but show() does not work, and if I read() I see random symbols.
images = gzip.open("train-images-idx3-ubyte.gz", 'rb')
print(images.read())
EDIT
Managed to get some useful output by using:
with gzip.open('train-images-idx3-ubyte.gz', 'r') as fin:
    for line in fin:
        print('got line', line)
Somehow I now have to convert this output into an image.
Download the training/testing images and labels and uncompress them into a working directory, e.g. samples/.
Get the python-mnist package from PyPI:
pip install python-mnist
Import the mnist package and read the training/testing images:
from mnist import MNIST
mndata = MNIST('samples')
images, labels = mndata.load_training()
# or
images, labels = mndata.load_testing()
Display an image on the console:
import random
index = random.randrange(0, len(images))  # choose an index ;-)
print(mndata.display(images[index]))
You will get something like this:
............................
............................
............................
............................
............................
.................@@.........
..............@@@@@.........
............@@@@............
..........@@................
..........@.................
...........@................
...........@................
...........@...@............
...........@@@@@.@..........
...........@@@...@@.........
...........@@.....@.........
..................@.........
..................@@........
..................@@........
..................@.........
.................@@.........
...........@.....@..........
...........@....@@..........
............@@@@............
.............@..............
............................
............................
............................
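Explanation: each image in the images list is a Python list of unsigned bytes, and labels is a Python array of unsigned bytes.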
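Since each image is just a flat list of 784 pixel values, the console rendering is easy to reproduce by hand. Below is a minimal sketch of that idea, not python-mnist's own code: the function name render_ascii and the threshold of 128 are assumptions chosen for illustration.

```python
def render_ascii(pixels, width=28, threshold=128):
    """Render a flat sequence of 0-255 pixel values as rows of '.' and '@'."""
    rows = []
    for start in range(0, len(pixels), width):
        row = pixels[start:start + width]
        # Bright pixels become '@', everything else becomes '.'.
        rows.append("".join("@" if value > threshold else "." for value in row))
    return "\n".join(rows)


# Tiny 4x4 example instead of a real 28x28 digit.
demo = [0, 0, 0, 0,
        0, 255, 255, 0,
        0, 255, 255, 0,
        0, 0, 0, 0]
print(render_ascii(demo, width=4))
```

With the data loaded above, print(render_ascii(images[index])) should give a picture similar to the one shown.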
(Using only matplotlib, gzip and numpy)
Extract the image data:
import gzip
f = gzip.open('train-images-idx3-ubyte.gz','r')
image_size = 28
num_images = 5
import numpy as np
f.read(16)
buf = f.read(image_size * image_size * num_images)
data = np.frombuffer(buf, dtype=np.uint8).astype(np.float32)
data = data.reshape(num_images, image_size, image_size, 1)
Display an image:
import matplotlib.pyplot as plt
image = np.asarray(data[2]).squeeze()
plt.imshow(image)
plt.show()
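The question asks for .gif files in a folder rather than an on-screen plot. One possible way to do that is sketched below. It assumes Pillow is installed (pip install Pillow), which the answer above does not use, and the helper name save_as_gifs and the file naming scheme are made up for this example.

```python
import os

import numpy as np
from PIL import Image  # Pillow, an assumption: not used elsewhere in this thread


def save_as_gifs(images, out_dir):
    """Write each 2-D array in `images` to out_dir as 00000.gif, 00001.gif, ..."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i, img in enumerate(images):
        path = os.path.join(out_dir, f"{i:05d}.gif")
        # Convert to uint8 so Pillow treats the array as 8-bit grayscale.
        Image.fromarray(np.asarray(img, dtype=np.uint8)).save(path)
        paths.append(path)
    return paths
```

With the data array from above, which has a trailing channel axis of length 1, a call like save_as_gifs(data[..., 0], "digits") should write five GIF files into a digits folder.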
Print the first 50 labels:
f = gzip.open('train-labels-idx1-ubyte.gz','r')
f.read(8)  # skip the 8-byte header
for i in range(0,50):
    buf = f.read(1)
    labels = np.frombuffer(buf, dtype=np.uint8).astype(np.int64)
    print(labels)
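Reading one byte at a time works, but the label file can also be read in one go, and the 8 skipped bytes can be checked instead of discarded. Here is a minimal sketch using only the standard library. The function name read_labels is an assumption; the magic number 2049 (0x00000801) is the value the IDX format uses for a 1-D unsigned-byte file.

```python
import gzip
import struct


def read_labels(path):
    """Return all labels from a gzipped IDX1 label file as a list of ints."""
    with gzip.open(path, "rb") as f:
        # Header: two big-endian 32-bit integers (magic number, item count).
        magic, count = struct.unpack(">II", f.read(8))
        if magic != 2049:
            raise ValueError(f"unexpected magic number {magic}")
        payload = f.read(count)
    if len(payload) != count:
        raise ValueError("file is shorter than its header claims")
    return list(payload)  # iterating over bytes yields ints in Python 3
```

Something like print(read_labels('train-labels-idx1-ubyte.gz')[:50]) would then print the same 50 labels as the loop above.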
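You can actually use the idx2numpy package available on PyPI. It is extremely simple to use and converts the data directly into numpy arrays. Here is what you have to do:
Download the MNIST dataset from the official website.
If you are using Linux, you can use wget to fetch it straight from the command line. Just run: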
wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Unzip or decompress the data. On Linux, you can use gzip.
Ultimately, you should have the following files:
data/train-images-idx3-ubyte
data/train-labels-idx1-ubyte
data/t10k-images-idx3-ubyte
data/t10k-labels-idx1-ubyte
The prefix data/ is just because I have extracted them into a folder named data. Judging by your question, you have got this far already, so keep reading.
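If the gzip command-line tool is not available (for example on Windows), the decompression step can also be done from Python with the standard library. This is only a sketch, and the helper name gunzip is made up for this example.

```python
import gzip
import shutil


def gunzip(src, dst):
    """Decompress the gzip file at `src` into a new file at `dst`."""
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)  # streams the data in chunks
```

For instance, gunzip('train-images-idx3-ubyte.gz', 'data/train-images-idx3-ubyte') would produce the first file in the list above, provided the data folder already exists.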
Here is some simple Python code that reads everything from a decompressed file into a numpy array.
import idx2numpy
import numpy as np
file = 'data/train-images-idx3-ubyte'
arr = idx2numpy.convert_from_file(file)
# arr is now a np.ndarray type of object of shape 60000, 28, 28
You can now use it with OpenCV in the same way you would display any other image, using something like
cv.imshow("Image", arr[4])
To install idx2numpy, you can use PyPI (the pip package manager). Simply run:
pip install idx2numpy
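For anyone curious what such a converter has to work with: an IDX file starts with a 4-byte magic number (two zero bytes, a data type code, and the number of dimensions), followed by one big-endian 32-bit size per dimension. The sketch below decodes just that header with the standard library. It is an illustration of the file format, not idx2numpy's actual implementation, and the names IDX_TYPES and read_idx_header are assumptions.

```python
import struct

# Type codes defined by the IDX format, mapped to struct format characters.
IDX_TYPES = {0x08: "B", 0x09: "b", 0x0B: "h", 0x0C: "i", 0x0D: "f", 0x0E: "d"}


def read_idx_header(f):
    """Read an IDX header from a binary file object; return (type_char, shape)."""
    zero1, zero2, type_code, ndim = struct.unpack(">BBBB", f.read(4))
    if zero1 != 0 or zero2 != 0 or type_code not in IDX_TYPES:
        raise ValueError("not an IDX file")
    # One big-endian unsigned 32-bit integer per dimension.
    shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
    return IDX_TYPES[type_code], shape
```

Calling it on open('data/train-images-idx3-ubyte', 'rb') should report unsigned bytes ("B") and a three-element shape, matching the array shape mentioned above.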
import gzip
import numpy as np

def training_images():
    with gzip.open('data/train-images-idx3-ubyte.gz', 'r') as f:
        # first 4 bytes is a magic number
        magic_number = int.from_bytes(f.read(4), 'big')
        # second 4 bytes is the number of images
        image_count = int.from_bytes(f.read(4), 'big')
        # third 4 bytes is the row count
        row_count = int.from_bytes(f.read(4), 'big')
        # fourth 4 bytes is the column count
        column_count = int.from_bytes(f.read(4), 'big')
        # rest is the image pixel data, each pixel is stored as an unsigned byte
        # pixel values are 0 to 255
        image_data = f.read()
        images = np.frombuffer(image_data, dtype=np.uint8)\
            .reshape((image_count, row_count, column_count))
        return images

def training_labels():
    with gzip.open('data/train-labels-idx1-ubyte.gz', 'r') as f:
        # first 4 bytes is a magic number
        magic_number = int.from_bytes(f.read(4), 'big')
        # second 4 bytes is the number of labels
        label_count = int.from_bytes(f.read(4), 'big')
        # rest is the label data, each label is stored as unsigned byte
        # label values are 0 to 9
        label_data = f.read()
        labels = np.frombuffer(label_data, dtype=np.uint8)
        return labels
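Install idx2numpy
pip install idx2numpy
Download the data
Download the MNIST dataset from the official website.
Decompress the data
Ultimately, you should have the following files: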
train-images-idx3-ubyte
train-labels-idx1-ubyte
t10k-images-idx3-ubyte
t10k-labels-idx1-ubyte
Using idx2numpy
import numpy as np
import idx2numpy
import matplotlib.pyplot as plt
imagefile = 'train-images.idx3-ubyte'
imagearray = idx2numpy.convert_from_file(imagefile)
plt.imshow(imagearray[4], cmap=plt.cm.binary)
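Use this to extract the MNIST database into images and CSV labels in Python:
Here is a function you can use directly! (It loads the images in binary form, i.e. every pixel is 0 or 1.)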
import gzip
import os
import numpy as np
import requests

def load_mnist(train_data=True, test_data=False):
    """
    Get mnist data from the official website and
    load them in binary format.
    Parameters
    ----------
    train_data : bool
        Loads
        'train-images-idx3-ubyte.gz'
        'train-labels-idx1-ubyte.gz'
    test_data : bool
        Loads
        't10k-images-idx3-ubyte.gz'
        't10k-labels-idx1-ubyte.gz'
    Return
    ------
    tuple
        tuple[0] are images (train & test)
        tuple[1] are labels (train & test)
    """
    RESOURCES = [
        'train-images-idx3-ubyte.gz',
        'train-labels-idx1-ubyte.gz',
        't10k-images-idx3-ubyte.gz',
        't10k-labels-idx1-ubyte.gz']
    if (os.path.isdir('data') == 0):
        os.mkdir('data')
    if (os.path.isdir('data/mnist') == 0):
        os.mkdir('data/mnist')
    for name in RESOURCES:
        if (os.path.isfile('data/mnist/'+name) == 0):
            url = 'http://yann.lecun.com/exdb/mnist/'+name
            r = requests.get(url, allow_redirects=True)
            # fail loudly instead of saving an error page as a .gz file
            r.raise_for_status()
            open('data/mnist/'+name, 'wb').write(r.content)
    return get_images(train_data, test_data), get_labels(train_data, test_data)

def get_images(train_data=True, test_data=False):
    to_return = []
    if train_data:
        with gzip.open('data/mnist/train-images-idx3-ubyte.gz', 'r') as f:
            # first 4 bytes is a magic number
            magic_number = int.from_bytes(f.read(4), 'big')
            # second 4 bytes is the number of images
            image_count = int.from_bytes(f.read(4), 'big')
            # third 4 bytes is the row count
            row_count = int.from_bytes(f.read(4), 'big')
            # fourth 4 bytes is the column count
            column_count = int.from_bytes(f.read(4), 'big')
            # rest is the image pixel data, each pixel is stored as an unsigned byte
            # pixel values are 0 to 255
            image_data = f.read()
            train_images = np.frombuffer(image_data, dtype=np.uint8)\
                .reshape((image_count, row_count, column_count))
            # binarize: pixels brighter than 127 become 1, the rest 0
            to_return.append(np.where(train_images > 127, 1, 0))
    if test_data:
        with gzip.open('data/mnist/t10k-images-idx3-ubyte.gz', 'r') as f:
            # first 4 bytes is a magic number
            magic_number = int.from_bytes(f.read(4), 'big')
            # second 4 bytes is the number of images
            image_count = int.from_bytes(f.read(4), 'big')
            # third 4 bytes is the row count
            row_count = int.from_bytes(f.read(4), 'big')
            # fourth 4 bytes is the column count
            column_count = int.from_bytes(f.read(4), 'big')
            # rest is the image pixel data, each pixel is stored as an unsigned byte
            # pixel values are 0 to 255
            image_data = f.read()
            test_images = np.frombuffer(image_data, dtype=np.uint8)\
                .reshape((image_count, row_count, column_count))
            # binarize: pixels brighter than 127 become 1, the rest 0
            to_return.append(np.where(test_images > 127, 1, 0))
    return to_return

def get_labels(train_data=True, test_data=False):
    to_return = []
    if train_data:
        with gzip.open('data/mnist/train-labels-idx1-ubyte.gz', 'r') as f:
            # first 4 bytes is a magic number
            magic_number = int.from_bytes(f.read(4), 'big')
            # second 4 bytes is the number of labels
            label_count = int.from_bytes(f.read(4), 'big')
            # rest is the label data, each label is stored as unsigned byte
            # label values are 0 to 9
            label_data = f.read()
            train_labels = np.frombuffer(label_data, dtype=np.uint8)
            to_return.append(train_labels)
    if test_data:
        with gzip.open('data/mnist/t10k-labels-idx1-ubyte.gz', 'r') as f:
            # first 4 bytes is a magic number
            magic_number = int.from_bytes(f.read(4), 'big')
            # second 4 bytes is the number of labels
            label_count = int.from_bytes(f.read(4), 'big')
            # rest is the label data, each label is stored as unsigned byte
            # label values are 0 to 9
            label_data = f.read()
            test_labels = np.frombuffer(label_data, dtype=np.uint8)
            to_return.append(test_labels)
    return to_return
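One detail worth knowing about the binarization step: np.where(cond, 1, 0) returns an array of the platform's default integer type, which is typically several bytes per pixel. If memory matters, casting the boolean mask to uint8 gives the same 0/1 values at one byte per pixel. A small sketch of the difference, using a made-up 1x4 "image":

```python
import numpy as np

images = np.array([[0, 127, 128, 255]], dtype=np.uint8)

as_where = np.where(images > 127, 1, 0)     # platform default integer dtype
as_uint8 = (images > 127).astype(np.uint8)  # one byte per pixel

print(as_where.dtype, as_uint8.dtype)
print(as_uint8)
```

Both arrays hold the same values, so either form can be used in the function above.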
I had the same problem.
Whenever I unzipped the file, the extension was not removed, so I had:
train-images-idx3-ubyte.gz
Once I removed the .gz, I had:
train-images-idx3-ubyte
This solved my problem.
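Note that renaming a file does not decompress it, so it can be worth checking what the file really contains before picking a reader. Gzip streams always begin with the two bytes 0x1f 0x8b, which makes for a quick test. A minimal sketch follows; the function name is_gzip is made up for this example.

```python
def is_gzip(path):
    """Return True if the file starts with the two-byte gzip signature."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"
```

If is_gzip('train-images-idx3-ubyte') returns True, the file is still compressed despite its name and should be opened with gzip.open rather than read as raw IDX data.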