
Segfault with error 4 in pthread_create

I consistently receive the following segfault with error 4 (which, as far as I know, indicates a null pointer dereference, according to https://rgeissert.blogspot.com/p/segmentation-fault-error.html ):

Aug  6 11:42:54 mypc kernel: [28532305.723536] myapp-new[14784]: segfault at 18 ip 00007f642c6c5d44 sp 00007ffc6a937700 error 4 in libpthread-2.23.so[7f642c6bc000+18000]
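For reference, the number after "error" in such a kernel line is a bit mask taken from the x86 page-fault error code. The following is a minimal sketch of how the low bits can be read; the function name decodeSegfaultError is made up for illustration and is not part of any library.

```cpp
#include <cstdint>
#include <string>

// Decode the low bits of an x86 page-fault error code, as printed in the
// kernel's "segfault at ... error N" line.
// decodeSegfaultError is a hypothetical helper name used only in this sketch.
std::string decodeSegfaultError(std::uint32_t code) {
    std::string out;
    // Bit 2: set when the access came from user mode.
    out += (code & 0x4) ? "user-mode " : "kernel-mode ";
    // Bit 4: set when the fault was an instruction fetch.
    if (code & 0x10) {
        out += "instruction fetch";
    } else {
        // Bit 1: set for a write, clear for a read.
        out += (code & 0x2) ? "write" : "read";
    }
    // Bit 0: set for a protection violation, clear when the page is not mapped.
    out += (code & 0x1) ? " (protection violation)" : " (page not present)";
    return out;
}
```

Under this reading, error 4 is a user-mode read of an unmapped page. Together with the faulting address 0x18 from the log line, that is consistent with reading a field at a small offset through a null pointer, although the log alone does not prove it.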

The code is as follows:

void* request(void* p){
    // ... code
}

void Daemon::run(){

    pthread_attr_t attr; 
    pthread_attr_init(&attr); 
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    LOG(INFO) << "daemon running...";

    int sockfd = serverSocket.getSocket();

    struct pollfd fds[1]; 
    fds[0].fd = sockfd; 
    fds[0].events = POLLIN | POLLPRI;

    pthread_t threads[1];

    while (!shutdown_daemon) {
        int currNrThreads = activeThreads;
        int i = 0;
        for (; activeThreads > nrThreads; i++) {
            usleep(4000);
        }
        if (i != 0) {
            LOG(INFO) << "too many threads, slept for " << i * 4 << "ms (" << currNrThreads
                      << " threads, now " << activeThreads << ")";
        }

        // wait until there is data to read on the listen socket
        //
        int retVal = poll(fds, 1, 2000);
        if (retVal == -1) {
            LOG(WARNING) << "error in poll: " << strerror(errno);
        }

        if (retVal <= 0) {
            continue;
        }

        // open a socket to communicate with and read the header
        //
        struct sockaddr_storage clientAddr;
        unsigned int clientLen = sizeof(clientAddr);
        int connectionID = accept(sockfd, (struct sockaddr*)&clientAddr, &clientLen);
        if (connectionID == -1) {
            LOG(WARNING) << "accept return -1, error: " << strerror(errno);
            continue;
        }

        // pack the payload into buffer
        auto buf = new size_t[2];
        buf[0] = (size_t)this;
        buf[1] = (size_t)connectionID;

        int ret = pthread_create(&threads[0], &attr, &request, (void*)buf);
        if (ret != 0) {
            LOG(ERROR) << "could not start pthread, ret: '" << ret << "'";
            delete[] buf;
            continue;
        }
        // Atomic increment
        ++activeThreads; 
    }

    LOG(INFO) << "term signal received, waiting child processess to finish";

    while (activeThreads > 0) {
        LOG(INFO) << "waiting for child processes to finish...";
        sleep(1);
    }

    LOG(INFO) << "all child processes are finished";
}

with the following backtrace:

(gdb) bt
#0  __pthread_create_2_1 (newthread=<optimized out>, attr=<optimized out>, start_routine=<optimized out>, arg=<optimized out>) at pthread_create.c:713
#1  0x00000000004621fb in Daemon::run (this=0x7ffe1715d260) at src/daemon/daemon.cpp:992
#2  0x0000000000443842 in main (argc=4, argv=0x7ffe1715d6f8) at src/myapp.cpp:991
(gdb) bt full
#0  __pthread_create_2_1 (newthread=<optimized out>, attr=<optimized out>, start_routine=<optimized out>, arg=<optimized out>) at pthread_create.c:713
        stackaddr = <optimized out>
        iattr = <optimized out>
        default_attr = {schedparam = {__sched_priority = 0}, schedpolicy = 0, flags = -1456592096, guardsize = 16,
          stackaddr = 0x409d50 <_GLOBAL__sub_I__ZN4YAML5RegExC2Ev+16>, stacksize = 140729285727984, cpuset = 0x0, cpusetsize = 0}
        free_cpuset = <optimized out>
        pd = 0x7fb373fff700
        retval = <optimized out>
        self = <optimized out>
        thread_ran = true
        __PRETTY_FUNCTION__ = "__pthread_create_2_1"
#1  0x00000000004621fb in Daemon::run (this=0x7ffe1715d260) at src/daemon/daemon.cpp:992
        currNrThreads = -1447528544
        i = 8
        retVal = 0
        clientAddr = {ss_family = 2,
          __ss_padding = "\336L\177\000\000\001", '\000' <repeats 24 times>, "\260\315\025\027\376\177\000\000\006\000\000\000\000\000\000\000daemon\000\000\b\317\025\027\376\177\000\000\320\315\025\027\376\177\000\000\n\000\000\000\000\000\000\000BackupLock\000\027\376\177\000\000\000\316\025\027\376\177\000\000!\303@\000\000\000\000\000\274\204O\000\000\000\000", __ss_align = 140729285725960}
        buf = 0x7fb300000000
        clientLen = 32691
        connectionID = 3
        attr = {__size = "\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\000\020", '\000' <repeats 37 times>, __align = 0}
        __PRETTY_FUNCTION__ = "void Daemon::run()"
        sockfd = 16
        fds = {{fd = 3, events = 3, revents = 1}}
        threads = {140408722028288}

How is this possible? I would understand an out-of-memory error, but where can error 4 happen in this code (Ubuntu 16.04, glibc 2.23)?

We also suspected hardware issues, so we replaced all the RAM in this machine, but the issue persisted.

Update: after moving the software to another machine (with glibc 2.26), the crashes stopped completely under the same load.

First of all, I would try to replace the following:

// pack the payload into buffer
auto buf = new size_t[2];
buf[0] = (size_t)this;
buf[1] = (size_t)connectionID;

with a proper struct, like:

struct thread_param
{
  Daemon* daemon;
  int connectionID;
};

Maybe I'm just paranoid, but I wouldn't use an auto declaration in a case like this, where the value has to be cast later.
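A minimal sketch of how such a struct could be handed to pthread_create is shown here. The names ThreadParam, startRequestThread and lastConnectionSeen are invented for this example, Daemon is an empty stand-in for the real class, and the thread is created joinable (null attributes) only so the result can be observed; the original code uses a detached attribute.

```cpp
#include <pthread.h>
#include <atomic>
#include <memory>

class Daemon {};  // empty stand-in for the real Daemon class (assumption)

// Typed payload instead of a size_t[2] array; no integer casts are needed.
struct ThreadParam {
    Daemon* daemon;
    int connectionID;
};

// Hypothetical global, present only so the example can observe the thread.
std::atomic<int> lastConnectionSeen{-1};

void* request(void* p) {
    // Take ownership immediately so the payload is freed on every exit path.
    std::unique_ptr<ThreadParam> param(static_cast<ThreadParam*>(p));
    lastConnectionSeen = param->connectionID;
    return nullptr;
}

// Returns 0 on success, otherwise the error code from pthread_create.
int startRequestThread(Daemon* daemon, int connectionID, pthread_t* thread) {
    std::unique_ptr<ThreadParam> param(new ThreadParam{daemon, connectionID});
    int ret = pthread_create(thread, nullptr, &request, param.get());
    if (ret == 0) {
        // The new thread owns the payload now; do not free it here.
        param.release();
    }
    // On failure the unique_ptr frees the payload automatically.
    return ret;
}
```

With this layout the payload is freed exactly once, whether pthread_create fails (the caller's unique_ptr frees it) or succeeds (the thread's unique_ptr frees it).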

Second, there can be many reasons for a crash, and you didn't give all the information.

Does your program crash at:

  1. The first iteration?
  2. The second iteration?
  3. The 1000th iteration?

For each case, the cause may be different:

  1. You are using an uninitialized pointer.
  2. You are using a deleted pointer.
  3. You have a memory leak that causes some memory allocation to fail. For example, I don't see you deleting buf when pthread_create succeeds (I guess that should happen in the request function), and you didn't check whether the allocation succeeded.
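The third point can be sketched as follows, keeping the size_t[2] layout from the question. The helper names makePayload and consumePayload are hypothetical. Note that a plain new expression throws std::bad_alloc on failure rather than returning a null pointer, so a null check only makes sense with the std::nothrow form used here.

```cpp
#include <cstddef>
#include <new>

// Caller side: allocate the two-slot payload.
// Returns nullptr on allocation failure instead of throwing.
// makePayload is a hypothetical helper name used only in this sketch.
std::size_t* makePayload(const void* self, int connectionID) {
    std::size_t* buf = new (std::nothrow) std::size_t[2];
    if (buf == nullptr) {
        return nullptr;  // the caller should log the failure and skip this connection
    }
    buf[0] = reinterpret_cast<std::size_t>(self);
    buf[1] = static_cast<std::size_t>(connectionID);
    return buf;
}

// Thread side: unpack the connection ID and free the payload.
// If the thread function never does this, every accepted connection leaks.
int consumePayload(std::size_t* buf) {
    int connectionID = static_cast<int>(buf[1]);
    delete[] buf;
    return connectionID;
}
```

In the question's code, the equivalent of consumePayload would have to run at the start of request; whether it does cannot be told from the snippet, because the body of request is elided.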
