
Failed to find HDF5 dataset data - Single Label Regression Using Caffe and HDF5 data

I used @shai's code to create my HDF5 file. It is available here:

Test labels for regression caffe, float not allowed?

My data consists of grayscale images (1000 images to start with), and each label is a single float value.

So I modified his code as follows:

import h5py, os
import caffe
import numpy as np

SIZE = 224 
with open( 'train.txt', 'r' ) as T :
    lines = T.readlines()
X = np.zeros( (len(lines), 1, SIZE, SIZE), dtype='f4' )  #Changed 3 to 1
y = np.zeros( (len(lines)), dtype='f4' ) #Removed the "1,"
for i,l in enumerate(lines):
    sp = l.split(' ')
    img = caffe.io.load_image( sp[0], color=False ) #Added , color=False
    img = caffe.io.resize( img, (SIZE, SIZE, 1) )   #Changed 3 to 1
    # you may apply other input transformations here...
    X[i] = img
    y[i] = float(sp[1])
with h5py.File('train.h5','w') as H:
    H.create_dataset( 'X', data=X ) # note the name X given to the dataset!
    H.create_dataset( 'y', data=y ) # note the name y given to the dataset!
with open('train_h5_list.txt','w') as L:
    L.write( 'train.h5' ) # list all h5 files you are going to use

But I got this error:

ValueError                                Traceback (most recent call last)
<ipython-input-19-8148f7b9e03d> in <module>()
     13     img = caffe.io.resize( img, (SIZE, SIZE, 1) )   #Changed 3 to 1
     14     # you may apply other input transformations here...
---> 15     X[i] = img
     16     y[i] = float(sp[1])

ValueError: could not broadcast input array from shape (224,224,1) into shape (1,224,224)

So I changed line 13 from:

img = caffe.io.resize( img, (SIZE, SIZE, 1) ) 

to:

img = caffe.io.resize( img, (1, SIZE, SIZE) ) 

And the code ran fine.
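
A note on that change: asking the resize call for an output of shape (1, SIZE, SIZE) makes the shapes match, but it most likely interpolates the height axis down to 1 and stretches the single channel axis to SIZE, so the stored array may no longer be the original picture. A more conservative route, sketched below, keeps the resize target at (SIZE, SIZE, 1) and then moves the channel axis to the front with a transpose. The sketch assumes the image arrives as height x width x 1 (which is what the error message above reports) and uses a random array as a stand-in for the loaded image.

```python
import numpy as np

SIZE = 224

# Stand-in for one resized grayscale image in (H, W, C) layout, float32.
# In the real script this would be the result of the resize to (SIZE, SIZE, 1).
img = np.random.rand(SIZE, SIZE, 1).astype('f4')

# Move the channel axis to the front: (H, W, C) -> (C, H, W).
# No pixel values are changed, only the axis order.
chw = img.transpose((2, 0, 1))

# The destination array uses the N, C, H, W layout from the question.
X = np.zeros((1, 1, SIZE, SIZE), dtype='f4')
X[0] = chw

print(chw.shape)
```

With this variant, X[i] = img.transpose((2, 0, 1)) would replace the plain X[i] = img assignment in the loop.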

For training, I used this solver.prototxt file:

net: "MyCaffeTrain/train_test.prototxt"

# Note: 1 iteration = 1 forward pass over all the images in one batch

# Carry out a validation test every 500 training iterations.
test_interval: 500 

# test_iter specifies how many forward passes the validation test should carry out
#  a good number is num_val_imgs / batch_size (see batch_size in Data layer in phase TEST in train_test.prototxt)
test_iter: 100 

# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9 
weight_decay: 0.0005

# We want to initially move fast towards the local minimum and as we approach it, we want to move slower
# To this end, there are various learning rates policies available:
#  fixed: always return base_lr.
#  step: return base_lr * gamma ^ (floor(iter / step))
#  exp: return base_lr * gamma ^ iter
#  inv: return base_lr * (1 + gamma * iter) ^ (- power)
#  multistep: similar to step but it allows non uniform steps defined by stepvalue
#  poly: the effective learning rate follows a polynomial decay, to be zero by the max_iter: return base_lr * (1 - iter/max_iter) ^ (power)
#  sigmoid: the effective learning rate follows a sigmoid decay: return base_lr * ( 1/(1 + exp(-gamma * (iter - stepsize))))
lr_policy: "inv"
gamma: 0.0001
power: 0.75 
#stepsize: 10000 # Drop the learning rate in steps by a factor of gamma every stepsize iterations

# Display every 100 iterations
display: 100 

# The maximum number of iterations
max_iter: 10000

# snapshot intermediate results, that is, every 5000 iterations it saves a snapshot of the weights
snapshot: 5000
snapshot_prefix: "MyCaffeTrain/lenet_multistep"

# solver mode: CPU or GPU
solver_mode: CPU
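
To make the numbers in this solver file concrete, here is a small sketch of the "inv" policy formula quoted in the comments, evaluated with the base_lr, gamma and power values above. It also works out the test_iter rule of thumb (num_val_imgs / batch_size). The function names are made up for illustration, and the validation-set size of 1000 is a hypothetical figure, since the post does not state it.

```python
import math


def inv_lr(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under the "inv" policy: base_lr * (1 + gamma * iter) ^ (-power)."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


def passes_to_cover(num_val_images, batch_size):
    """Forward passes needed to see every validation image at least once."""
    return math.ceil(num_val_images / batch_size)


# Learning rate at the start, at the snapshot point, and at max_iter.
for it in (0, 5000, 10000):
    print(it, inv_lr(it))

# Hypothetical validation set of 1000 images with the TEST batch_size of 1000.
print(passes_to_cover(1000, 1000))
```

With these settings the rate falls from 0.01 to about 0.0059 by iteration 10000. Under the hypothetical validation size, one pass would already cover the whole set, whereas test_iter: 100 with batch_size: 1000 pushes 100,000 samples through the net at every test.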

And my train_test.prototxt file is:

name: "LeNet"
layer {
  name: "mnist"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "MyCaffeTrain/train_h5_list.txt"
    batch_size: 1000
  }
  include: { phase: TRAIN }
}

layer {
  name: "mnist"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "MyCaffeTrain/test_h5_list.txt"
    batch_size: 1000
  }
  include: { phase: TEST }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

But when I train, I get this error:

I0914 13:59:33.198423  8251 layer_factory.hpp:74] Creating layer mnist
I0914 13:59:33.198452  8251 net.cpp:96] Creating Layer mnist
I0914 13:59:33.198467  8251 net.cpp:415] mnist -> data
I0914 13:59:33.198510  8251 net.cpp:415] mnist -> label
I0914 13:59:33.198532  8251 net.cpp:160] Setting up mnist
I0914 13:59:33.198549  8251 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: MyCaffeTrain/train_h5_list.txt
I0914 13:59:33.198884  8251 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
F0914 13:59:33.200848  8251 io.cpp:237] Check failed: H5LTfind_dataset(file_id, dataset_name_) Failed to find HDF5 dataset data
*** Check failure stack trace: ***
    @     0x7fcfa9fb05cd  google::LogMessage::Fail()
    @     0x7fcfa9fb2433  google::LogMessage::SendToLog()
    @     0x7fcfa9fb015b  google::LogMessage::Flush()
    @     0x7fcfa9fb2e1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fcfaa426b13  caffe::hdf5_load_nd_dataset_helper<>()
    @     0x7fcfaa423ec5  caffe::hdf5_load_nd_dataset<>()
    @     0x7fcfaa34bd3d  caffe::HDF5DataLayer<>::LoadHDF5FileData()
    @     0x7fcfaa345ae6  caffe::HDF5DataLayer<>::LayerSetUp()
    @     0x7fcfaa3fdd75  caffe::Net<>::Init()
    @     0x7fcfaa4001ff  caffe::Net<>::Net()
    @     0x7fcfaa40b935  caffe::Solver<>::InitTrainNet()
    @     0x7fcfaa40cd6e  caffe::Solver<>::Init()
    @     0x7fcfaa40cf36  caffe::Solver<>::Solver()
    @           0x411980  caffe::GetSolver<>()
    @           0x4093a6  train()
    @           0x406de0  main
    @     0x7fcfa9049830  __libc_start_main
    @           0x407319  _start
    @              (nil)  (unknown)

I have tried my best but still cannot find the cause of this error. Is the database I created in the correct format? I read somewhere that the data layout should be:

N, C, H, W (No. of Data, Channels, Height, Width) 
For my case: 1000,1,224,224

On checking X.shape I get the same result: 1000,1,224,224

I can't see what I am doing wrong. Any help would be appreciated. Thanks in advance.

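I solved the problem by making the following changes to the code. The HDF5Data layer looks up datasets by the names of its top blobs ("data" and "label" in my prototxt), which is why the log says "Failed to find HDF5 dataset data" when the file only contains datasets named X and y: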

H.create_dataset( 'data', data=X )   # dataset renamed from 'X' to 'data' to match the layer's top "data"
H.create_dataset( 'label', data=y )  # dataset renamed from 'y' to 'label' to match the layer's top "label"

The error was gone.
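
A quick way to catch this kind of mismatch before training is to compare the dataset names in the file with the top names of the data layer. The sketch below is only an illustration: both function names are made up, and check_h5_file assumes h5py is installed and that the file sits at the path you pass in.

```python
def missing_datasets(top_names, dataset_names):
    """Return the layer tops that have no dataset of the same name in the file."""
    available = set(dataset_names)
    return [name for name in top_names if name not in available]


def check_h5_file(path, top_names=("data", "label")):
    """Open an HDF5 file and report which expected dataset names are absent."""
    import h5py  # imported here so the pure check above has no extra dependency
    with h5py.File(path, "r") as h5:
        return missing_datasets(top_names, h5.keys())


# The file written by the original script held datasets named 'X' and 'y':
print(missing_datasets(["data", "label"], ["X", "y"]))         # ['data', 'label']
# After renaming the datasets, nothing is missing:
print(missing_datasets(["data", "label"], ["data", "label"]))  # []
```

Calling check_h5_file('train.h5') on the rewritten file should then return an empty list.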

I am still having a problem with the EuclideanLoss layer, though. I'll look into it and post another question if required.
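
One plausible cause, assuming the net is still as posted above: ip2 has num_output: 10 while each label is a single float, and as far as I know EuclideanLoss expects both of its inputs to hold the same number of values per sample. The sketch below imitates that check in plain Python. It is not Caffe's own code, and the function names are hypothetical.

```python
from math import prod


def per_sample_count(shape):
    """Number of values per sample: the product of all axes after the batch axis."""
    return prod(shape[1:])


def euclidean_loss_compatible(pred_shape, label_shape):
    """True if predictions and labels agree in batch size and per-sample size."""
    return (pred_shape[0] == label_shape[0]
            and per_sample_count(pred_shape) == per_sample_count(label_shape))


batch = 1000
print(euclidean_loss_compatible((batch, 10), (batch,)))  # ip2 with num_output: 10 -> False
print(euclidean_loss_compatible((batch, 1), (batch,)))   # ip2 with num_output: 1  -> True
```

If this is indeed the problem, setting num_output: 1 in ip2 would be the first thing to try. The Accuracy layer is also meant for classification, so it is probably not meaningful for a regression target like this one.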


 