Python - Retrieving data and labels from a pickle file
I have a pickle file that looks like this:
[array([[[148, 124, 115],
[150, 127, 116],
[154, 129, 121],
...,
[159, 142, 133],
[159, 142, 133],
[161, 145, 142]],
[[165, 136, 145],
[176, 137, 141],
[178, 138, 144],
...,
[199, 163, 171],
[202, 163, 167],
[200, 158, 163]]]), array([1, 1])]
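In a previous question, we were able to retrieve the data and labels by handling each of them separately. However, that approach is not suitable when I have many images.
My script currently looks like this: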
data, labels = [], []
for i in range(0, 1):
    filename = 'data.pickle'
    batch_data = unpickle(filename)
    if len(data) > 0:
        data = np.vstack((data, batch_data[0][i]))
        labels = np.hstack((labels, batch_data[1][i]))
    else:
        data = batch_data[0][0]
        labels = batch_data[1][0]
data = data.astype(np.float32)
return data, labels
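For example, when I run the code and print labels, I always get 1, whereas I expect to get two labels, [1 1] (I'm not sure whether they should be displayed as an array?).
What am I doing wrong here?
Thanks.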
I was able to get the labels the way you expect. I used:
import pickle
import numpy as np

# Create batch data that represents what you are asking; I created three images and three labels.
# dtype=object is needed on recent NumPy versions because the two rows hold different kinds of items.
batch_data = np.array([[np.random.random((5, 5)), np.random.random((5, 5)), np.random.random((5, 5))], np.array([1, 1, 1])], dtype=object)

# Pickle the data
with open("test.pickle", "wb") as f:
    pickle.dump(batch_data, f)

# Create data and labels separately
def test_func(batch_data):
    data, labels = [], []
    # Loop over every item in the batch, not just the first one
    for i in range(0, batch_data.shape[1]):
        if len(data) > 0:
            data = np.vstack((data, batch_data[0][i]))
            labels = np.hstack((labels, batch_data[1][i]))
        else:
            data = batch_data[0][0]
            labels = batch_data[1][0]
    data = data.astype(np.float32)
    return data, labels

# Unpickle
with open("test.pickle", "rb") as f:
    unpickled_batch_data = pickle.load(f)

# Get stacked data and labels
data, labels = test_func(unpickled_batch_data)
print(labels)
This returns
[1 1 1]
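For comparison with the loop in the question, here is a minimal sketch of why range(0, 1) yields a single label. The helper name stack_first_n and the toy batch are hypothetical and exist only for illustration; the stacking logic mirrors the code above.

```python
import numpy as np

# Toy batch in the same layout as the pickle file: [images, labels].
# The shapes and values are made up for illustration.
batch = [np.arange(2 * 3 * 3).reshape(2, 3, 3), np.array([1, 1])]

def stack_first_n(batch, n):
    """Stack the first n images and labels (hypothetical helper)."""
    data, labels = [], []
    for i in range(n):
        if len(data) > 0:
            data = np.vstack((data, batch[0][i]))
            labels = np.hstack((labels, batch[1][i]))
        else:
            data = batch[0][0]
            labels = batch[1][0]
    return data, labels

# range(0, 1) visits only i = 0, so a single scalar label comes back.
_, one_label = stack_first_n(batch, 1)

# Looping over every item yields one label per image.
all_data, all_labels = stack_first_n(batch, len(batch[1]))

print(one_label)
print(all_labels)
```

Using the number of labels as the loop bound is one option among several; batch_data.shape[1] in the answer above serves the same purpose when the batch is a NumPy array.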
You can get away with simply using zip twice:
In [24]: pickle_data = [array([[[148, 124, 115],
...: [150, 127, 116],
...: [154, 129, 121],
...: [159, 142, 133],
...: [159, 142, 133],
...: [161, 145, 142]],
...:
...: [[165, 136, 145],
...: [176, 137, 141],
...: [178, 138, 144],
...: [199, 163, 171],
...: [202, 163, 167],
...: [200, 158, 163]]]), array([1, 1])]
You also need to use the * operator to unpack the arguments:
In [25]: data, labels = zip(*zip(*pickle_data))
In [26]: data
Out[26]:
(array([[148, 124, 115],
[150, 127, 116],
[154, 129, 121],
[159, 142, 133],
[159, 142, 133],
[161, 145, 142]]), array([[165, 136, 145],
[176, 137, 141],
[178, 138, 144],
[199, 163, 171],
[202, 163, 167],
[200, 158, 163]]))
In [27]: labels
Out[27]: (1, 1)
Now the labels and the data correspond by index:
In [28]: data[0]
Out[28]:
array([[148, 124, 115],
[150, 127, 116],
[154, 129, 121],
[159, 142, 133],
[159, 142, 133],
[161, 145, 142]])
In [29]: data[1]
Out[29]:
array([[165, 136, 145],
[176, 137, 141],
[178, 138, 144],
[199, 163, 171],
[202, 163, 167],
[200, 158, 163]])
In [30]: labels[0]
Out[30]: 1
In [31]: labels[1]
Out[31]: 1
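To make the two zip calls easier to follow, here is a small self-contained sketch that splits them into separate steps. The array values are invented stand-ins for the real image data, and the intermediate name pairs is hypothetical.

```python
import numpy as np

# Small stand-in for the unpickled object: [images, labels].
pickle_data = [np.array([[[1, 2], [3, 4]],
                         [[5, 6], [7, 8]]]),
               np.array([1, 1])]

# Inner zip: pairs each image with its label.
pairs = list(zip(*pickle_data))

# Outer zip: regroups the pairs into a tuple of images and a tuple of labels.
data, labels = zip(*pairs)

print(len(pairs), len(data), len(labels))
```

The intermediate pairs list is often the more useful form if you want to iterate over (image, label) tuples directly.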
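Or, better still in my opinion: since the images are stored along the first axis, you can use a list comprehension to split the array into a list of arrays: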
In [37]: images = pickle_data[0]
In [38]: labels = pickle_data[1]
Split the array:
In [39]: images = [x for x in images]
In [40]: images[0]
Out[40]:
array([[148, 124, 115],
[150, 127, 116],
[154, 129, 121],
[159, 142, 133],
[159, 142, 133],
[161, 145, 142]])
In [41]: images[1]
Out[41]:
array([[165, 136, 145],
[176, 137, 141],
[178, 138, 144],
[199, 163, 171],
[202, 163, 167],
[200, 158, 163]])
In [42]: labels[0]
Out[42]: 1
In [43]: labels[1]
Out[43]: 1
In [44]: labels
Out[44]: array([1, 1])
In [45]: images
Out[45]:
[array([[148, 124, 115],
[150, 127, 116],
[154, 129, 121],
[159, 142, 133],
[159, 142, 133],
[161, 145, 142]]), array([[165, 136, 145],
[176, 137, 141],
[178, 138, 144],
[199, 163, 171],
[202, 163, 167],
[200, 158, 163]])]
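Putting this together, a minimal end-to-end sketch might look like the following. It assumes the pickle holds a two-element list of [images, labels] as shown in the question; the sample values are invented, and the round trip happens in memory rather than through data.pickle. The float32 conversion is carried over from the question's script.

```python
import pickle
import numpy as np

# Stand-in for the contents of the pickle file; the values are invented.
original = [np.arange(2 * 2 * 3).reshape(2, 2, 3), np.array([1, 1])]

# Round-trip through pickle in memory instead of touching the disk.
images, labels = pickle.loads(pickle.dumps(original))

# Iterating over an array walks its first axis, so this yields one array per image.
image_list = [img.astype(np.float32) for img in images]

print(len(image_list), labels)
```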