How do I save images as an h5py file?

Question

I have a train folder containing 2000 images of different sizes, along with a labels.csv file. When training a network, loading and resizing these images is time-consuming, so I read some papers about h5py, which seems to be a solution for this situation. I tried the following code:

import datetime as dt
import os
from glob import glob

import cv2
import h5py
import pandas as pd

PATH = os.path.abspath(os.path.join('Data'))
SOURCE_IMAGES = os.path.join(PATH, "Train")
print("[INFO] images paths reading")
images = glob(os.path.join(SOURCE_IMAGES, "*.jpg"))
images.sort()
print("[INFO] image labels reading")
labels = pd.read_csv('Data/labels.csv')

# Binary label per image: 1.0 for "car", 0.0 otherwise
train_labels = []
for i in range(len(labels["car"])):
    if labels["car"][i] == 1.0:
        train_labels.append(1.0)
    else:
        train_labels.append(0.0)

data_order = 'tf'  # 'th' = channels first, 'tf' = channels last

if data_order == 'th':
    train_shape = (len(images), 3, 224, 224)
else:
    train_shape = (len(images), 224, 224, 3)
print("[INFO] h5py file created")

hf = h5py.File('data.hdf5', 'w')

hf.create_dataset("train_img",
                  shape=train_shape,
                  maxshape=train_shape,
                  compression="gzip",
                  compression_opts=9)

hf.create_dataset("train_labels",
                  shape=(len(train_labels),),
                  maxshape=(None,),
                  compression="gzip",
                  compression_opts=9)

hf["train_labels"][...] = train_labels

print("[INFO] read and size images")
for i, addr in enumerate(images):
    s = dt.datetime.now()
    # Load, resize to 224x224, and convert OpenCV's BGR order to RGB
    img = cv2.imread(images[i])
    img = cv2.resize(img, (224, 224), interpolation=cv2.INTER_CUBIC)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

    hf["train_img"][i, ...] = img[None]
    e = dt.datetime.now()
    print("[INFO] image %d is saved time: %s second" % (i, e - s))

hf.close()

But when I run this code, it runs for hours. At first it is very fast, but later it becomes very slow, especially at the line hf["train_img"][i, ...] = img[None]. The program's output shows that the time per image keeps increasing. What am I doing wrong? Thanks for any advice.

Answer 1

train_img is created with compression_opts=9. This is the highest compression level, so it takes the most work to compress and decompress.
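To get a rough feel for what the level costs, here is a minimal sketch that times the standard-library zlib module at levels 1, 4 and 9. HDF5's gzip filter is built on the same deflate algorithm, but this is only a stand-in: the synthetic buffer below is an assumption, real photos compress differently, and the timings will not match h5py's exactly.

```python
import time
import zlib

# Synthetic stand-in for one 224x224 RGB image (assumption: not a real photo).
raw = bytes((x * 7 + (x >> 3)) % 256 for x in range(224 * 224 * 3))

results = {}
for level in (1, 4, 9):
    start = time.perf_counter()
    packed = zlib.compress(raw, level)
    elapsed = time.perf_counter() - start
    results[level] = (len(packed), elapsed)
    print("level %d: %d bytes in %.4f s" % (level, len(packed), elapsed))
    # Compression is lossless at every level: only time and size differ.
    assert zlib.decompress(packed) == raw
```

Running the same comparison on a few of your own resized images should show whether level 9 buys enough space to justify the extra time.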

If the time spent compressing images is the bottleneck and you can trade it for some extra disk space, use a lower compression level, such as the default (4), or even disable compression completely.
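A minimal sketch of that suggestion follows. The helper name create_image_store is hypothetical, and two choices go beyond what the answer states and should be read as assumptions: storing pixels as uint8 rather than h5py's float32 default, and setting chunks so that each image is its own chunk, which may keep a single-image write from touching chunks shared with other images.

```python
import os
import tempfile

import h5py
import numpy as np


def create_image_store(path, n_images, compression=None, level=None):
    """Hypothetical helper: open an HDF5 file holding 224x224 RGB images."""
    hf = h5py.File(path, "w")
    hf.create_dataset(
        "train_img",
        shape=(n_images, 224, 224, 3),
        dtype=np.uint8,            # assumption: 8-bit pixels, not the float32 default
        chunks=(1, 224, 224, 3),   # assumption: one chunk per image
        compression=compression,   # None disables compression entirely
        compression_opts=level,    # gzip level; leave as None when compression is None
    )
    return hf


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.hdf5")
    hf = create_image_store(path, n_images=4, compression="gzip", level=4)

    # Random pixels stand in for a resized image.
    img = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
    hf["train_img"][0] = img

    stored_chunks = hf["train_img"].chunks
    stored_compression = hf["train_img"].compression
    stored_level = hf["train_img"].compression_opts
    round_trip_ok = bool(np.array_equal(hf["train_img"][0], img))
    hf.close()
```

Calling create_image_store(path, n_images) with the defaults creates the same dataset with no compression at all, which is the other option the answer mentions.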

Solution 1
1 2018-07-27 10:16:23
