Python - Retrieving data and labels from a pickle file

I have a pickle file that looks like this:

[array([[[148, 124, 115],
        [150, 127, 116],
        [154, 129, 121],
        ..., 
        [159, 142, 133],
        [159, 142, 133],
        [161, 145, 142]],

       [[165, 136, 145],
        [176, 137, 141],
        [178, 138, 144],
        ..., 
        [199, 163, 171],
        [202, 163, 167],
        [200, 158, 163]]]), array([1, 1])]

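In a previous question, we were able to retrieve the data and the labels by handling each one separately. However, that approach will not be suitable once I have many images.

My script now looks like this: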

data, labels = [], []
    for i in range(0, 1):

        filename = 'data.pickle'
        batch_data = unpickle(filename)
        if len(data) > 0:
            data = np.vstack((data, batch_data[0][i]))
            labels = np.hstack((labels, batch_data[1][i]))
        else:
            data = batch_data[0][0]
            labels = batch_data[1][0]

        data = data.astype(np.float32)
        return data, labels

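For example, when I run the code and print labels, I always get 1, whereas I expect to get two labels, [1 1] (I am not sure whether they should be displayed as an array?).

What am I doing wrong here?

Thanks.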

I was able to get the labels the way you expect. I used:

import pickle

import numpy as np

# Create batch data that represents what you are asking: row 0 holds three images, row 1 holds three labels.
# An explicit object array avoids the ragged-array ValueError that newer NumPy versions raise for a nested np.array call.
batch_data = np.empty((2, 3), dtype=object)
for i in range(3):
    batch_data[0, i] = np.random.random((5, 5))
    batch_data[1, i] = 1

# Pickle the data
with open("test.pickle", "wb") as f:
    pickle.dump(batch_data, f)

# Create data and labels separately
def test_func(batch_data):
    data, labels = [], []
    for i in range(0, batch_data.shape[1]):
        if len(data) > 0:
            data = np.vstack((data, batch_data[0][i]))
            labels = np.hstack((labels, batch_data[1][i]))
        else:
            data = batch_data[0][0]
            labels = batch_data[1][0]
        data = data.astype(np.float32)
    return data, labels

# Unpickle
with open("test.pickle", "rb") as f:
    unpickled_batch_data = pickle.load(f)

# Get stacked data and labels
data, labels = test_func(unpickled_batch_data)
print(labels)

This returns:

[1 1 1]
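
For comparison, the loop in the question stops early for two reasons that are visible in its code: range(0, 1) only yields i = 0, and the return statement sits inside the loop body. Below is a minimal sketch of a version that visits every label, assuming the pickle holds a two-element list [images, labels] as shown in the question. The name stack_batch is hypothetical, and unlike the vstack approach above it keeps one image per entry along the first axis.

```python
import numpy as np

def stack_batch(batch):
    """Return (data, labels) from an [images, labels] pair (hypothetical helper)."""
    images, labels = batch
    # Take one image per label, then stack once after the loop has finished.
    data = np.stack([images[i] for i in range(len(labels))]).astype(np.float32)
    return data, np.asarray(labels)

# Synthetic stand-in for the unpickled object: two 2x3 "images" and two labels.
batch = [np.arange(12).reshape(2, 2, 3), np.array([1, 1])]
data, labels = stack_batch(batch)
print(labels)      # [1 1]
print(data.shape)  # (2, 2, 3)
```

Stacking once at the end also avoids re-copying the growing array on every iteration.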

You can get away with simply using zip twice:

In [24]: pickle_data = [array([[[148, 124, 115],
    ...:         [150, 127, 116],
    ...:         [154, 129, 121],
    ...:         [159, 142, 133],
    ...:         [159, 142, 133],
    ...:         [161, 145, 142]],
    ...:
    ...:        [[165, 136, 145],
    ...:         [176, 137, 141],
    ...:         [178, 138, 144],
    ...:         [199, 163, 171],
    ...:         [202, 163, 167],
    ...:         [200, 158, 163]]]), array([1, 1])]

You will also need to use the * operator to unpack the arguments:

In [25]: data, labels = zip(*zip(*pickle_data))

In [26]: data
Out[26]:
(array([[148, 124, 115],
        [150, 127, 116],
        [154, 129, 121],
        [159, 142, 133],
        [159, 142, 133],
        [161, 145, 142]]), array([[165, 136, 145],
        [176, 137, 141],
        [178, 138, 144],
        [199, 163, 171],
        [202, 163, 167],
        [200, 158, 163]]))

In [27]: labels
Out[27]: (1, 1)

Now the labels and the data correspond by index:

In [28]: data[0]
Out[28]:
array([[148, 124, 115],
       [150, 127, 116],
       [154, 129, 121],
       [159, 142, 133],
       [159, 142, 133],
       [161, 145, 142]])

In [29]: data[1]
Out[29]:
array([[165, 136, 145],
       [176, 137, 141],
       [178, 138, 144],
       [199, 163, 171],
       [202, 163, 167],
       [200, 158, 163]])

In [30]: labels[0]
Out[30]: 1

In [31]: labels[1]
Out[31]: 1
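
To make the double zip easier to follow, here is a small sketch that performs the two steps separately. It uses a synthetic stand-in for pickle_data (an assumption, with the same [images, labels] layout) and the intermediate name pairs, which does not appear in the session above.

```python
import numpy as np

# Synthetic stand-in: two 6x3 "images" and their two labels.
pickle_data = [np.arange(36).reshape(2, 6, 3), np.array([1, 1])]

# First zip: pair each image with its label.
pairs = list(zip(*pickle_data))
print(len(pairs))         # 2
print(pairs[0][0].shape)  # (6, 3)

# Second zip: regroup the pairs into (all images, all labels).
data, labels = zip(*pairs)
print(len(data), len(labels))  # 2 2
```

The intermediate pairs list is often the more convenient form when each image has to be processed together with its label.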

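Or, probably better in my opinion: since the images are stored along the first axis, you can use a list comprehension to break the array apart into a list of arrays: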

In [37]: images = pickle_data[0]

In [38]: labels = pickle_data[1]

Break the array apart:

In [39]: images = [x for x in images]

In [40]: images[0]
Out[40]:
array([[148, 124, 115],
       [150, 127, 116],
       [154, 129, 121],
       [159, 142, 133],
       [159, 142, 133],
       [161, 145, 142]])

In [41]: images[1]
Out[41]:
array([[165, 136, 145],
       [176, 137, 141],
       [178, 138, 144],
       [199, 163, 171],
       [202, 163, 167],
       [200, 158, 163]])

In [42]: labels[0]
Out[42]: 1

In [43]: labels[1]
Out[43]: 1

In [44]: labels
Out[44]: array([1, 1])

In [45]: images
Out[45]:
[array([[148, 124, 115],
        [150, 127, 116],
        [154, 129, 121],
        [159, 142, 133],
        [159, 142, 133],
        [161, 145, 142]]), array([[165, 136, 145],
        [176, 137, 141],
        [178, 138, 144],
        [199, 163, 171],
        [202, 163, 167],
        [200, 158, 163]])]
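
If the images end up spread across several pickle files (an assumption, since the question only shows data.pickle), the same [images, labels] layout can be joined along the first axis. The sketch below uses a hypothetical helper, load_batches, and writes two synthetic batches to a temporary directory purely for demonstration.

```python
import os
import pickle
import tempfile

import numpy as np

def load_batches(paths):
    """Load [images, labels] pickles and join them along the first axis."""
    images, labels = [], []
    for path in paths:
        with open(path, "rb") as f:
            batch_images, batch_labels = pickle.load(f)
        images.append(np.asarray(batch_images))
        labels.append(np.asarray(batch_labels))
    return np.concatenate(images), np.concatenate(labels)

# Demo: write two small synthetic batches, then load them back.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for n in range(2):
        path = os.path.join(tmp, f"batch_{n}.pickle")
        with open(path, "wb") as f:
            pickle.dump([np.zeros((2, 6, 3), dtype=np.uint8), np.array([1, 1])], f)
        paths.append(path)
    all_images, all_labels = load_batches(paths)

print(all_images.shape)  # (4, 6, 3)
print(all_labels)        # [1 1 1 1]
```

Collecting the batches in lists and concatenating once keeps the copying cost linear in the number of files.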

In [46]:
