
C++ seg fault issue

I am working on a C++ program that uses some external C libraries. As far as I can tell, though, they are not the cause of the problem; the issue is in my C++ code. The program runs fine, with no errors, on my test datasets, but after going through nearly the entire full dataset I get a segfault. Running it under GDB gives me this:

(gdb) run -speciesMain=allMis1 -speciesOther=anoCar2 -speciesMain=allMis1 -speciesOther=anoCar2 /hive/data/genomes/allMis1/bed/lastz.anoCar2/mafRBestNet/*.maf.gz
Starting program: /cluster/home/jstjohn/bin/mafPairwiseSyntenyDecay -speciesMain=allMis1 -speciesOther=anoCar2 -speciesMain=allMis1 -speciesOther=anoCar2 /hive/data/genomes/allMis1/bed/lastz.anoCar2/mafRBestNet/*.maf.gz
Detaching after fork from child process 3718.

Program received signal SIGSEGV, Segmentation fault.
0x0000003009cb7672 in __gnu_cxx::__exchange_and_add(int volatile*, int) () from /usr/lib64/libstdc++.so.6
(gdb) up
#1  0x0000003009c9db59 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() () from /usr/lib64/libstdc++.so.6
(gdb) up
#2  0x00000000004051e7 in PairAlnInfo::~PairAlnInfo (this=0x7fffffffcd70, __in_chrg=) at mafPairwiseSyntenyDecay.cpp:37
(gdb) up
#3  0x0000000000404eb0 in main (argc=2, argv=0x7fffffffcf78) at mafPairwiseSyntenyDecay.cpp:260

It looks like something is going wrong, perhaps a double free, in the destructor of my PairAlnInfo class. The weird thing is that I don't define a destructor, and I am not allocating anything with new. I have tried this with both g++44 and g++ 4.1.2 on the Linux machine and got the same results.

To make things even weirder, on my Linux box (a beefy system with plenty of RAM, not that RAM is an issue for this program) the segfault happens as described above, before the program reaches the loop that prints the output. On my much smaller MacBook Air, using either g++ or clang++, the program still segfaults, but not until after the results are printed, right before the final return(0) out of the main function. Here is what the GDB trace looks like on my Mac, running on the same file after compiling with the Mac's default g++ 4.2:

(more results)...
98000   27527   162181  0.83027
99000   27457   161467  0.829953
100000  27411   160794  0.829527

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x00004a2c00106077
0x00007fff9365a6e5 in std::string::_Rep::_M_dispose ()
(gdb) up
#1  0x00007fff9365a740 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string ()
(gdb) up
#2  0x0000000100003938 in main (argc=1261, argv=0x851d5fbff533) at mafPairwiseSyntenyDecay.cpp:301
(gdb)

Just in case you didn't notice the time of my posting, it's about 2:30 AM now, and I have been hacking away at this problem for about 10 hours. Thanks so much for taking the time to look at this and help me out! The code and some instructions for replicating my situation follow.

If you are interested in downloading and installing the whole thing with its dependencies, download my KentLib repository, run make in the base directory, then go to examples/mafPairwiseSyntenyDecay and run make there. An example (rather large) that triggers the bug I am discussing is the gzipped file available here: 100Mb file that the program crashes on. Then execute the program with these arguments: -speciesMain=allMis1 -speciesOther=anoCar2 anoCar2.allMis1.rbest.maf.gz

/**
 * mafPairwiseSyntenyDecay
 *  Author: John St. John
 *  Date: 4/26/2012
 *  
 *  calculates the mean synteny decay in different range bins
 *
 *
 */

//Kent source C imports
extern "C" {

#include "common.h"
#include "options.h"
#include "maf.h"

}

#include <map>
#include <string>
#include <set>
#include <vector>
#include <sstream>
#include <iostream>

//#define NDEBUG
#include <assert.h>

using namespace std;


/*
Global variables
 */

class PairAlnInfo {
public:
  string oname;
  int sstart;
  int send;
  int ostart;
  int oend;
  char strand;
  PairAlnInfo(string _oname,
      int _sstart, int _send,
      int _ostart, int _oend,
      char _strand):
        oname(_oname),
        sstart(_sstart),
        send(_send),
        ostart(_ostart),
        oend(_oend),
        strand(_strand){}
  PairAlnInfo():
    oname("DUMMY"),
    sstart(-1),
    send(-1),
    ostart(-1),
    oend(-1),
    strand(-1){}

};

vector<string> &split(const string &s, char delim, vector<string> &elems) {
  stringstream ss(s);
  string item;
  while(getline(ss, item, delim)) {
    elems.push_back(item);
  }
  return(elems);
}


vector<string> split(const string &s, char delim) {
  vector<string> elems;
  return(split(s, delim, elems));
}

#define DEF_MIN_LEN (200)
#define DEF_MIN_SCORE (200)

typedef map<int,PairAlnInfo> PairAlnInfoByPos;
typedef map<string, PairAlnInfoByPos > ChromToPairAlnInfoByPos;
ChromToPairAlnInfoByPos pairAlnInfoByPosByChrom;


void usage()
/* Explain usage and exit. */
{
  errAbort(
      (char*)"mafPairwiseSyntenyDecay -- Calculates pairwise syntenic decay from maf alignment containing at least the two specified species.\n"
      "usage:\n"
      "\tmafPairwiseSyntenyDecay [options] [*required options] file1.maf[.gz] ... \n"
      "Options:\n"
      "\t-help\tPrints this message.\n"
      "\t-minScore=NUM\tMinimum MAF alignment score to consider (default 200)\n"
      "\t-minAlnLen=NUM\tMinimum MAF alignment block length to consider (default 200)\n"
      "\t-speciesMain=NAME\t*Name of the main species (exactly as it appears before the '.') in the maf file (REQUIRED)\n"
      "\t-speciesOther=NAME\t*Name of the other species (exactly as it appears before the '.') in the maf file (REQUIRED)\n"
  );
}//end usage()


static struct optionSpec options[] = {
    /* Structure holding command line options */
    {(char*)"help",OPTION_STRING},
    {(char*)"minScore",OPTION_INT},
    {(char*)"minAlnLen",OPTION_INT},
    {(char*)"speciesMain",OPTION_STRING},
    {(char*)"speciesOther",OPTION_STRING},
    {NULL, 0}
}; //end options()

/**
 * Main function, takes filenames for paired qseq reads
 * and outputs three files.
 */
int iterateOverAlignmentBlocksAndStorePairInfo(char *fileName, const int minScore, const int minAlnLen, const string speciesMain, const string speciesOther){
  struct mafFile * mFile = mafOpen(fileName);
  struct mafAli * mAli;

  //loop over alignment blocks
  while((mAli = mafNext(mFile)) != NULL){
    struct mafComp *first = mAli->components;
    int seqlen = mAli->textSize;
    //First find and store set of duplicates in this block
    set<string> seen;
    set<string> dups;
    if(mAli->score < minScore || seqlen < minAlnLen){
      //free here and pre-maturely end
      mafAliFree(&mAli);
      continue;
    }

    for(struct mafComp *item = first; item != NULL; item = item->next){
      string tmp(item->src);
      string tname = split(tmp,'.')[0];
      if(seen.count(tname)){
        //seen this item
        dups.insert(tname);
      }else{
        seen.insert(tname);
      }
    }
    for(struct mafComp *item1 = first; item1->next != NULL; item1 = item1->next){
      //stop one before the end
      string tmp1(item1->src);
      vector<string> nameSplit1(split(tmp1,'.'));
      string name1(nameSplit1[0]);
      if(dups.count(name1) || (name1 != speciesMain && name1 != speciesOther)){
        continue;
      }

      for(struct mafComp *item2 = item1->next; item2 != NULL; item2 = item2->next){
        string tmp2(item2->src);
        vector<string> nameSplit2(split(tmp2,'.'));
        string name2 = nameSplit2[0];
        if(dups.count(name2) || (name2 != speciesMain && name2 != speciesOther)){
          continue;
        }

        string chr1(nameSplit1[1]);
        string chr2(nameSplit2[1]);
        char strand;
        if(item1->strand == item2->strand)
          strand = '+';
        else
          strand = '-';

        int start1,end1,start2,end2;

        if(item1->strand == '+'){
          start1 = item1->start;
          end1 = start1 + item1->size;
        }else{
          end1 = item1->start;
          start1 = end1 - item1->size;
        }

        if(item2->strand == '+'){
          start2 = item2->start;
          end2 = start2+ item2->size;
        }else{
          end2 = item2->start;
          start2 = end2 - item2->size;
        }

        if(name1 == speciesMain){
          PairAlnInfo aln(chr2,start1,end1,start2,end2,strand);
          pairAlnInfoByPosByChrom[chr1][start1] = aln;
        }else{
          PairAlnInfo aln(chr1,start2,end2,start1,end1,strand);
          pairAlnInfoByPosByChrom[chr2][start2] = aln;
        }

      } //end loop over item2
    } //end loop over item1
    mafAliFree(&mAli);
  }//end loop over alignment blocks

  mafFileFree(&mFile);
  return(0);
}


int main(int argc, char *argv[])
/* Process command line. */
{
  optionInit(&argc, argv, options);
  if(optionExists((char*)"help") || argc <= 1){
    usage();
  }
  int minAlnScore = optionInt((char*)"minScore",DEF_MIN_SCORE);
  int minAlnLen = optionInt((char*)"minAlnLen",DEF_MIN_LEN);

  string speciesMain(optionVal((char*)"speciesMain",NULL));
  string speciesOther(optionVal((char*)"speciesOther",NULL));

  if(speciesMain.empty() || speciesOther.empty())
    usage();

  //load the relevant alignment info from the maf(s)
  for(int i = 1; i<argc; i++){
    iterateOverAlignmentBlocksAndStorePairInfo(argv[i], minAlnScore, minAlnLen, speciesMain, speciesOther);
  }

  const int blockSize = 1000;
  const int blockCount = 100;

  int totalWindows[blockCount] = {0};
  int containBreak[blockCount] = {0};

  //we want the fraction of windows of each size that contain a break
  //


  for(ChromToPairAlnInfoByPos::iterator mainChromItter = pairAlnInfoByPosByChrom.begin();
      mainChromItter != pairAlnInfoByPosByChrom.end();
      mainChromItter++){
    //process the alignments shared by this chromosome
    //note that map stores them sorted by begin position
    vector<int> keys;
    for(PairAlnInfoByPos::iterator posIter = mainChromItter->second.begin();
        posIter != mainChromItter->second.end();
        posIter++){
      keys.push_back(posIter->first);
    }

    for(int i = 0; i < keys.size(); i++){
      //first check for trivial window (ie our block)
      PairAlnInfo pi1 = mainChromItter->second[keys[i]];
      assert(pi1.send > pi1.sstart);
      assert(pi1.sstart == keys[i]);
      int numBucketsThisWindow = (pi1.send - pi1.sstart) / blockSize;
      for(int k = 0; k < numBucketsThisWindow && k < blockCount; k++)
        totalWindows[k]++;


      for(int j = i+1; j < keys.size(); j++){

        PairAlnInfo pi2 = mainChromItter->second[keys[j]];

        assert(pi2.sstart == keys[j]);
        assert(pi2.send > pi2.sstart);
        assert(pi2.sstart > pi1.sstart);

        if(pi2.oname == pi1.oname){
          int moreToInc = (pi2.send - pi1.sstart) / blockSize;
          for(int k = numBucketsThisWindow; k < moreToInc && k < blockCount; k++)
            totalWindows[k]++;
          numBucketsThisWindow = moreToInc; //so we don't double count
        }else{

          int numDiscontigBuckets = (pi2.send - pi1.sstart) / blockSize;
          for(int k = numBucketsThisWindow; k < numDiscontigBuckets && k < blockSize; k++){
            containBreak[k]++;
            totalWindows[k]++;
          }
          numBucketsThisWindow = numDiscontigBuckets;
        }
        if((keys[j] - keys[i]) >= (blockSize * blockCount)){
          //i = j;
          break;
        }
      }
    }
  }



  cout << "#WindowSize\tNumContainBreak\tNumTotal\t1-(NumContainBreak/NumTotal)" << endl;
  for(int i = 0; i < blockCount; i++){
    cout << (i+1)*blockSize << '\t';
    cout << containBreak[i] << '\t';
    cout << totalWindows[i] << '\t';
    cout << (totalWindows[i] > 0? 1.0 - (double(containBreak[i])/double(totalWindows[i])): 0) << endl;
  }


  return(0);
} //end main()

Try running your program under valgrind. This will give you a report of possibly or actually lost memory, uninitialised values, etc.

Your issues are probably due to memory corruption occurring at some point in the program prior to the actual errors you are seeing.

One potential issue in the code you posted is the loop:

 for(int k = numBucketsThisWindow; k<numDiscontigBuckets && k < blockSize; k++){

which uses blockSize (1000) instead of the correct blockCount (100, the length of the arrays), leading to a possible overflow of both the totalWindows[] and containBreak[] arrays. This would overwrite the speciesMain and speciesOther strings, along with anything else on the stack, which might very well result in the errors you are seeing.
