
caffe error == cudaSuccess (77 vs. 0) an illegal memory access was encountered

Recently I needed to change the code of softmax_loss_layer in Caffe. The Caffe version is: https://github.com/BVLC/caffe

The code between ... is my code, and the rest is unchanged. Here is how I modified the code. First, I added rs_ in loss_layers.hpp:

class SoftmaxWithLossLayer : public LossLayer<Dtype> {
Blob<Dtype> rs_;

Then I reshaped rs_ in softmax_loss_layer.cpp:

void SoftmaxWithLossLayer<Dtype>::LayerSetUp(
const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {

bottom[2] comes from dense_image_data_layer.cpp:

template <typename Dtype>
void DenseImageDataLayer<Dtype>::DataLayerSetUp(const vector<Blob<Dtype>*>& bottom,
  const vector<Blob<Dtype>*>& top) {
top[2]->Reshape(batch_size, 1, height, width);
this->prefetch_weight_.Reshape(batch_size, 1, height, width);
this->transformed_weight_.Reshape(1, 1, height, width);
Dtype* top_data = top[2]->mutable_cpu_data();
const int count = top[2]->count();
if(crop_size > 0) {
for(int index = 0;index < count;++index)
const int ww = index % crop_size;
const int wh = (index / crop_size) % crop_size;
for(int index = 0;index < count;++index)
const int ww = index % width;
const int wh = (index / width) % height;

Finally, I added this code in softmax_loss_layer.cu:

template <typename Dtype>
void SoftmaxWithLossLayer<Dtype>::Forward_gpu(
const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
Dtype loss;
Dtype em;
bool add_weight=this->layer_param_.loss_param().add_weight();
Dtype* weight=bottom[2]->mutable_gpu_data();
Dtype z=0;
Dtype* rs=rs_.mutable_gpu_data();
  CAFFE_CUDA_NUM_THREADS>>>(nthreads, prob_data, label, 
  weight_by_label_freqs_, label_count_data , loss_data, 
  outer_num_, dim, inner_num_, 
  has_ignore_label_, ignore_label_, counts,weight,rs);
const Dtype*rs1=rs_.gpu_data();
caffe_gpu_asum(nthreads, loss_data, &loss);
Dtype am=1/2*log((1-em)/em);

template <typename Dtype>
void SoftmaxWithLossLayer<Dtype>::Backward_gpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
const Dtype* weight=bottom[2]->gpu_data();
bool add_weight=this->layer_param_.loss_param().add_weight();
    CAFFE_CUDA_NUM_THREADS>>>(nthreads, top_data, label,
    weight_by_label_freqs_, label_count_data, bottom_diff, 
    outer_num_, dim, inner_num_, has_ignore_label_, 
    ignore_label_, counts,weight);

The error is:

F0108 10:45:48.291290  2859 math_functions.cu:81] Check failed: error ==     cudaSuccess (77 vs. 0)  an illegal memory access was encountered
*** Check failure stack trace: ***
@     0x7f0b3bcfbdaa  (unknown)
@     0x7f0b3bcfbce4  (unknown)
@     0x7f0b3bcfb6e6  (unknown)
@     0x7f0b3bcfe687  (unknown)
@     0x7f0b3c16dd48  caffe::caffe_gpu_memcpy()
@     0x7f0b3c09b67e  caffe::SyncedMemory::gpu_data()
@     0x7f0b3c054472  caffe::Blob<>::gpu_data()
@     0x7f0b3c0b0568  caffe::Net<>::ForwardFromTo()
@     0x7f0b3c0b0947  caffe::Net<>::ForwardPrefilled()
@     0x7f0b3c08e555  caffe::Solver<>::Step()
@     0x7f0b3c08ee8f  caffe::Solver<>::Solve()
@           0x407806  train()
@           0x405d41  main
@     0x7f0b3b20dec5  (unknown)
@           0x4062ed  (unknown)
@              (nil)  (unknown)

I think this error occurs because I didn't use rs_ correctly or because bottom[2] is not right. All the code I changed is posted above, so can anybody tell me what to do? If you need extra information, please tell me.

It is difficult to follow your code and changes.
Some comments:

  1. Reshaping rs_ should happen in the Reshape() method rather than in LayerSetUp().

  2. It is better to use rs_.ReshapeLike(*bottom[2]) than to explicitly enumerate num, channels, etc. What if you later have a Blob with a different number of dimensions?

  3. Have you tested your modified layer? From the Caffe wiki:

    Write tests in test/test_your_layer.cpp. Use test/test_gradient_check_util.hpp to check that your Forward and Backward implementations are in numerical agreement.
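Points 1 and 2 above can be sketched without Caffe itself. The snippet below is only an illustration under stated assumptions: `Blob` here is a hypothetical stand-in that tracks shape and element count, `WeightedLossLayer` and `RsCountAfterResize` are made-up names, and only the method names `Reshape`, `ReshapeLike`, `LayerSetUp`, and `count` mirror Caffe. It shows why a scratch blob sized in the layer's `Reshape()` keeps following bottom[2] when its shape changes, which a one-time sizing during setup would not do.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for caffe::Blob, reduced to shape bookkeeping.
class Blob {
 public:
  void Reshape(const std::vector<int>& shape) {
    shape_ = shape;
    std::size_t n = 1;
    for (int d : shape) n *= static_cast<std::size_t>(d);
    data_.assign(n, 0.0f);  // simplified: eager host allocation only
  }
  // Copies the other blob's shape, whatever its number of dimensions.
  void ReshapeLike(const Blob& other) { Reshape(other.shape_); }
  const std::vector<int>& shape() const { return shape_; }
  std::size_t count() const { return data_.size(); }

 private:
  std::vector<int> shape_;
  std::vector<float> data_;
};

// Hypothetical layer with a scratch blob rs_ that must match bottom[2].
class WeightedLossLayer {
 public:
  // One-time configuration only; rs_ is deliberately not sized here.
  void LayerSetUp(const std::vector<Blob*>& bottom) { (void)bottom; }

  // Caffe calls a layer's Reshape before each forward pass, so sizing rs_
  // here keeps it in step with bottom[2].
  void Reshape(const std::vector<Blob*>& bottom) {
    rs_.ReshapeLike(*bottom[2]);
  }

  const Blob& rs() const { return rs_; }

 private:
  Blob rs_;
};

// Sets the layer up with a single-image bottom[2], then changes the batch
// size and reshapes again; returns the resulting element count of rs_.
std::size_t RsCountAfterResize(int batch, int height, int width) {
  Blob scores, labels, weights;
  std::vector<Blob*> bottom = {&scores, &labels, &weights};

  weights.Reshape({1, 1, height, width});  // shape seen during setup
  WeightedLossLayer layer;
  layer.LayerSetUp(bottom);
  layer.Reshape(bottom);

  weights.Reshape({batch, 1, height, width});  // shape changes afterwards
  layer.Reshape(bottom);                       // rs_ is resized to match
  return layer.rs().count();
}
```

If rs_ were sized only once during setup, a kernel launched over the larger bottom[2] could index past the end of rs_, which is one plausible way to get an illegal memory access. Whether that is the actual cause here cannot be determined from the posted code.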
