
Loss Not Converging Caffe Regression

I'm doing regression in Caffe. The dataset is 400 RGB images of size 128x128, and the labels are float numbers in the range (-1, 1). The only transformation I applied to the dataset was normalization (I divided every pixel value in each RGB channel by 255). But the loss does not seem to converge at all.
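
For context, my HDF5 preparation is roughly the following (a minimal sketch assuming h5py and numpy; the file names and the random placeholder arrays stand in for my actual image-loading code):

import numpy as np
import h5py

# Placeholder data: 400 RGB images in Caffe's NCHW layout and one float label each.
images = np.random.randint(0, 256, size=(400, 3, 128, 128)).astype(np.uint8)
labels = np.random.uniform(-1.0, 1.0, size=(400, 1)).astype(np.float32)

# The only transformation applied: divide every pixel value by 255.
data = images.astype(np.float32) / 255.0

with h5py.File('train.h5', 'w') as f:
    f.create_dataset('data', data=data)     # read by the HDF5Data layer as "data"
    f.create_dataset('label', data=labels)  # read by the HDF5Data layer as "label"

# train_hdf5file.txt (the source of the HDF5Data layer) lists the HDF5 file paths.
with open('train_hdf5file.txt', 'w') as f:
    f.write('train.h5\n')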

What might be the possible reason for this? Can anyone please advise?

Here is my training log:

Training..
Using solver: solver_hdf5.prototxt
I0929 21:50:21.657784 13779 caffe.cpp:112] Use CPU.
I0929 21:50:21.658033 13779 caffe.cpp:174] Starting Optimization
I0929 21:50:21.658107 13779 solver.cpp:34] Initializing solver from parameters: 
test_iter: 100
test_interval: 500
base_lr: 0.0001
display: 25
max_iter: 10000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "lenet_hdf5"
solver_mode: CPU
net: "train_test_hdf5.prototxt"
I0929 21:50:21.658143 13779 solver.cpp:75] Creating training net from net file: train_test_hdf5.prototxt
I0929 21:50:21.658567 13779 net.cpp:334] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I0929 21:50:21.658709 13779 net.cpp:46] Initializing net from parameters: 
name: "MSE regression"
state {
  phase: TRAIN
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "train_hdf5file.txt"
    batch_size: 64
    shuffle: true
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "dropout1"
  type: "Dropout"
  bottom: "pool1"
  top: "pool1"
  dropout_param {
    dropout_ratio: 0.1
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "pool1"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "dropout2"
  type: "Dropout"
  bottom: "fc1"
  top: "fc1"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "fc1"
  top: "fc2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "fc2"
  bottom: "label"
  top: "loss"
}
I0929 21:50:21.658833 13779 layer_factory.hpp:74] Creating layer data
I0929 21:50:21.658859 13779 net.cpp:96] Creating Layer data
I0929 21:50:21.658871 13779 net.cpp:415] data -> data
I0929 21:50:21.658902 13779 net.cpp:415] data -> label
I0929 21:50:21.658926 13779 net.cpp:160] Setting up data
I0929 21:50:21.658936 13779 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: train_hdf5file.txt
I0929 21:50:21.659220 13779 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
I0929 21:50:21.920578 13779 net.cpp:167] Top shape: 64 3 128 128 (3145728)
I0929 21:50:21.920656 13779 net.cpp:167] Top shape: 64 1 (64)
I0929 21:50:21.920686 13779 layer_factory.hpp:74] Creating layer conv1
I0929 21:50:21.920740 13779 net.cpp:96] Creating Layer conv1
I0929 21:50:21.920774 13779 net.cpp:459] conv1 <- data
I0929 21:50:21.920825 13779 net.cpp:415] conv1 -> conv1
I0929 21:50:21.920877 13779 net.cpp:160] Setting up conv1
I0929 21:50:21.921985 13779 net.cpp:167] Top shape: 64 20 124 124 (19681280)
I0929 21:50:21.922050 13779 layer_factory.hpp:74] Creating layer relu1
I0929 21:50:21.922085 13779 net.cpp:96] Creating Layer relu1
I0929 21:50:21.922108 13779 net.cpp:459] relu1 <- conv1
I0929 21:50:21.922137 13779 net.cpp:404] relu1 -> conv1 (in-place)
I0929 21:50:21.922185 13779 net.cpp:160] Setting up relu1
I0929 21:50:21.922227 13779 net.cpp:167] Top shape: 64 20 124 124 (19681280)
I0929 21:50:21.922250 13779 layer_factory.hpp:74] Creating layer pool1
I0929 21:50:21.922277 13779 net.cpp:96] Creating Layer pool1
I0929 21:50:21.922298 13779 net.cpp:459] pool1 <- conv1
I0929 21:50:21.922323 13779 net.cpp:415] pool1 -> pool1
I0929 21:50:21.922418 13779 net.cpp:160] Setting up pool1
I0929 21:50:21.922472 13779 net.cpp:167] Top shape: 64 20 62 62 (4920320)
I0929 21:50:21.922495 13779 layer_factory.hpp:74] Creating layer dropout1
I0929 21:50:21.922534 13779 net.cpp:96] Creating Layer dropout1
I0929 21:50:21.922555 13779 net.cpp:459] dropout1 <- pool1
I0929 21:50:21.922582 13779 net.cpp:404] dropout1 -> pool1 (in-place)
I0929 21:50:21.922613 13779 net.cpp:160] Setting up dropout1
I0929 21:50:21.922652 13779 net.cpp:167] Top shape: 64 20 62 62 (4920320)
I0929 21:50:21.922672 13779 layer_factory.hpp:74] Creating layer fc1
I0929 21:50:21.922709 13779 net.cpp:96] Creating Layer fc1
I0929 21:50:21.922729 13779 net.cpp:459] fc1 <- pool1
I0929 21:50:21.922757 13779 net.cpp:415] fc1 -> fc1
I0929 21:50:21.922801 13779 net.cpp:160] Setting up fc1
I0929 21:50:22.301134 13779 net.cpp:167] Top shape: 64 500 (32000)
I0929 21:50:22.301193 13779 layer_factory.hpp:74] Creating layer dropout2
I0929 21:50:22.301210 13779 net.cpp:96] Creating Layer dropout2
I0929 21:50:22.301218 13779 net.cpp:459] dropout2 <- fc1
I0929 21:50:22.301232 13779 net.cpp:404] dropout2 -> fc1 (in-place)
I0929 21:50:22.301244 13779 net.cpp:160] Setting up dropout2
I0929 21:50:22.301254 13779 net.cpp:167] Top shape: 64 500 (32000)
I0929 21:50:22.301259 13779 layer_factory.hpp:74] Creating layer fc2
I0929 21:50:22.301270 13779 net.cpp:96] Creating Layer fc2
I0929 21:50:22.301275 13779 net.cpp:459] fc2 <- fc1
I0929 21:50:22.301285 13779 net.cpp:415] fc2 -> fc2
I0929 21:50:22.301295 13779 net.cpp:160] Setting up fc2
I0929 21:50:22.301317 13779 net.cpp:167] Top shape: 64 1 (64)
I0929 21:50:22.301328 13779 layer_factory.hpp:74] Creating layer loss
I0929 21:50:22.301338 13779 net.cpp:96] Creating Layer loss
I0929 21:50:22.301343 13779 net.cpp:459] loss <- fc2
I0929 21:50:22.301350 13779 net.cpp:459] loss <- label
I0929 21:50:22.301360 13779 net.cpp:415] loss -> loss
I0929 21:50:22.301374 13779 net.cpp:160] Setting up loss
I0929 21:50:22.301385 13779 net.cpp:167] Top shape: (1)
I0929 21:50:22.301391 13779 net.cpp:169]     with loss weight 1
I0929 21:50:22.301419 13779 net.cpp:239] loss needs backward computation.
I0929 21:50:22.301425 13779 net.cpp:239] fc2 needs backward computation.
I0929 21:50:22.301430 13779 net.cpp:239] dropout2 needs backward computation.
I0929 21:50:22.301436 13779 net.cpp:239] fc1 needs backward computation.
I0929 21:50:22.301441 13779 net.cpp:239] dropout1 needs backward computation.
I0929 21:50:22.301446 13779 net.cpp:239] pool1 needs backward computation.
I0929 21:50:22.301452 13779 net.cpp:239] relu1 needs backward computation.
I0929 21:50:22.301457 13779 net.cpp:239] conv1 needs backward computation.
I0929 21:50:22.301463 13779 net.cpp:241] data does not need backward computation.
I0929 21:50:22.301468 13779 net.cpp:282] This network produces output loss
I0929 21:50:22.301482 13779 net.cpp:531] Collecting Learning Rate and Weight Decay.
I0929 21:50:22.301491 13779 net.cpp:294] Network initialization done.
I0929 21:50:22.301496 13779 net.cpp:295] Memory required for data: 209652228
I0929 21:50:22.301908 13779 solver.cpp:159] Creating test net (#0) specified by net file: train_test_hdf5.prototxt
I0929 21:50:22.301935 13779 net.cpp:334] The NetState phase (1) differed from the phase (0) specified by a rule in layer data
I0929 21:50:22.302028 13779 net.cpp:46] Initializing net from parameters: 
name: "MSE regression"
state {
  phase: TEST
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "test_hdf5file.txt"
    batch_size: 30
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "dropout1"
  type: "Dropout"
  bottom: "pool1"
  top: "pool1"
  dropout_param {
    dropout_ratio: 0.1
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "pool1"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "dropout2"
  type: "Dropout"
  bottom: "fc1"
  top: "fc1"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "fc1"
  top: "fc2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "fc2"
  bottom: "label"
  top: "loss"
}
I0929 21:50:22.302146 13779 layer_factory.hpp:74] Creating layer data
I0929 21:50:22.302158 13779 net.cpp:96] Creating Layer data
I0929 21:50:22.302165 13779 net.cpp:415] data -> data
I0929 21:50:22.302176 13779 net.cpp:415] data -> label
I0929 21:50:22.302186 13779 net.cpp:160] Setting up data
I0929 21:50:22.302191 13779 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: test_hdf5file.txt
I0929 21:50:22.302305 13779 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
I0929 21:50:22.434798 13779 net.cpp:167] Top shape: 30 3 128 128 (1474560)
I0929 21:50:22.434849 13779 net.cpp:167] Top shape: 30 1 (30)
I0929 21:50:22.434864 13779 layer_factory.hpp:74] Creating layer conv1
I0929 21:50:22.434895 13779 net.cpp:96] Creating Layer conv1
I0929 21:50:22.434914 13779 net.cpp:459] conv1 <- data
I0929 21:50:22.434944 13779 net.cpp:415] conv1 -> conv1
I0929 21:50:22.434996 13779 net.cpp:160] Setting up conv1
I0929 21:50:22.435084 13779 net.cpp:167] Top shape: 30 20 124 124 (9225600)
I0929 21:50:22.435119 13779 layer_factory.hpp:74] Creating layer relu1
I0929 21:50:22.435205 13779 net.cpp:96] Creating Layer relu1
I0929 21:50:22.435237 13779 net.cpp:459] relu1 <- conv1
I0929 21:50:22.435292 13779 net.cpp:404] relu1 -> conv1 (in-place)
I0929 21:50:22.435328 13779 net.cpp:160] Setting up relu1
I0929 21:50:22.435371 13779 net.cpp:167] Top shape: 30 20 124 124 (9225600)
I0929 21:50:22.435400 13779 layer_factory.hpp:74] Creating layer pool1
I0929 21:50:22.435443 13779 net.cpp:96] Creating Layer pool1
I0929 21:50:22.435470 13779 net.cpp:459] pool1 <- conv1
I0929 21:50:22.435511 13779 net.cpp:415] pool1 -> pool1
I0929 21:50:22.435550 13779 net.cpp:160] Setting up pool1
I0929 21:50:22.435597 13779 net.cpp:167] Top shape: 30 20 62 62 (2306400)
I0929 21:50:22.435626 13779 layer_factory.hpp:74] Creating layer dropout1
I0929 21:50:22.435669 13779 net.cpp:96] Creating Layer dropout1
I0929 21:50:22.435698 13779 net.cpp:459] dropout1 <- pool1
I0929 21:50:22.435739 13779 net.cpp:404] dropout1 -> pool1 (in-place)
I0929 21:50:22.435780 13779 net.cpp:160] Setting up dropout1
I0929 21:50:22.435823 13779 net.cpp:167] Top shape: 30 20 62 62 (2306400)
I0929 21:50:22.435853 13779 layer_factory.hpp:74] Creating layer fc1
I0929 21:50:22.435899 13779 net.cpp:96] Creating Layer fc1
I0929 21:50:22.435926 13779 net.cpp:459] fc1 <- pool1
I0929 21:50:22.435971 13779 net.cpp:415] fc1 -> fc1
I0929 21:50:22.436018 13779 net.cpp:160] Setting up fc1
I0929 21:50:22.816076 13779 net.cpp:167] Top shape: 30 500 (15000)
I0929 21:50:22.816138 13779 layer_factory.hpp:74] Creating layer dropout2
I0929 21:50:22.816154 13779 net.cpp:96] Creating Layer dropout2
I0929 21:50:22.816160 13779 net.cpp:459] dropout2 <- fc1
I0929 21:50:22.816170 13779 net.cpp:404] dropout2 -> fc1 (in-place)
I0929 21:50:22.816182 13779 net.cpp:160] Setting up dropout2
I0929 21:50:22.816192 13779 net.cpp:167] Top shape: 30 500 (15000)
I0929 21:50:22.816197 13779 layer_factory.hpp:74] Creating layer fc2
I0929 21:50:22.816208 13779 net.cpp:96] Creating Layer fc2
I0929 21:50:22.816249 13779 net.cpp:459] fc2 <- fc1
I0929 21:50:22.816262 13779 net.cpp:415] fc2 -> fc2
I0929 21:50:22.816277 13779 net.cpp:160] Setting up fc2
I0929 21:50:22.816301 13779 net.cpp:167] Top shape: 30 1 (30)
I0929 21:50:22.816316 13779 layer_factory.hpp:74] Creating layer loss
I0929 21:50:22.816329 13779 net.cpp:96] Creating Layer loss
I0929 21:50:22.816337 13779 net.cpp:459] loss <- fc2
I0929 21:50:22.816347 13779 net.cpp:459] loss <- label
I0929 21:50:22.816359 13779 net.cpp:415] loss -> loss
I0929 21:50:22.816370 13779 net.cpp:160] Setting up loss
I0929 21:50:22.816381 13779 net.cpp:167] Top shape: (1)
I0929 21:50:22.816388 13779 net.cpp:169]     with loss weight 1
I0929 21:50:22.816407 13779 net.cpp:239] loss needs backward computation.
I0929 21:50:22.816416 13779 net.cpp:239] fc2 needs backward computation.
I0929 21:50:22.816426 13779 net.cpp:239] dropout2 needs backward computation.
I0929 21:50:22.816433 13779 net.cpp:239] fc1 needs backward computation.
I0929 21:50:22.816442 13779 net.cpp:239] dropout1 needs backward computation.
I0929 21:50:22.816452 13779 net.cpp:239] pool1 needs backward computation.
I0929 21:50:22.816460 13779 net.cpp:239] relu1 needs backward computation.
I0929 21:50:22.816468 13779 net.cpp:239] conv1 needs backward computation.
I0929 21:50:22.816478 13779 net.cpp:241] data does not need backward computation.
I0929 21:50:22.816486 13779 net.cpp:282] This network produces output loss
I0929 21:50:22.816500 13779 net.cpp:531] Collecting Learning Rate and Weight Decay.
I0929 21:50:22.816510 13779 net.cpp:294] Network initialization done.
I0929 21:50:22.816517 13779 net.cpp:295] Memory required for data: 98274484
I0929 21:50:22.816565 13779 solver.cpp:47] Solver scaffolding done.
I0929 21:50:22.816587 13779 solver.cpp:363] Solving MSE regression
I0929 21:50:22.816596 13779 solver.cpp:364] Learning Rate Policy: inv
I0929 21:50:22.870337 13779 solver.cpp:424] Iteration 0, Testing net (#0)

[Screenshots: training output at the beginning of training, and again after a while]

Update (after @lejlot's reply):

Training Images After Changing My Data:

[Screenshots: updated training output (Train1Update, Train2Update)]

It seems to be learning: the loss goes down. However, there is clearly something wrong with your data. Before any learning (iteration 0) you already have a loss of 0.0006, which is an extremely small loss for a random model, so it looks like your data is very odd. Look at your target values: are they really nicely distributed between -1 and 1, or is it more like 99% zeros and just a few other values?

There is nothing wrong with the approach itself; you simply need to do more analysis of your data. Make sure it actually spans the [-1, 1] interval. Once you fix that, there will be plenty of smaller things to tune, but this is the biggest issue right now: you get far too small an error with a random model, so the problem is the data, not the algorithm/method/parameters. To make things go faster, you can also increase the learning rate from the 0.0001 you are using right now, but as said before, first fix the data.
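
As a back-of-the-envelope check: Caffe's EuclideanLoss is 1/(2N) * sum((prediction - label)^2), so for labels spread uniformly over (-1, 1) and a model predicting roughly 0, the expected loss at iteration 0 would be around E[label^2]/2 = 1/6 ≈ 0.17, not 0.0006. A quick way to inspect the label distribution is a histogram; here is a minimal sketch assuming the labels sit in an HDF5 file like the one listed in train_hdf5file.txt (the path is a placeholder):

import numpy as np
import h5py

with h5py.File('train.h5', 'r') as f:  # placeholder path from train_hdf5file.txt
    labels = f['label'][:].ravel()

# If almost all the mass falls into one bin, the tiny loss at iteration 0 is explained.
counts, edges = np.histogram(labels, bins=10, range=(-1.0, 1.0))
for c, lo, hi in zip(counts, edges[:-1], edges[1:]):
    print('[%+.1f, %+.1f): %d' % (lo, hi, c))
print('mean = %.4f, std = %.4f' % (labels.mean(), labels.std()))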
