POSIX threads and -O3 optimisations

I am working on a program that uses MPI (Open MPI 1.4.3) and pthreads, written in C++ under Linux.

Some of the MPI nodes have a queuing system implemented with pthreads. The idea is simple: one thread adds elements to a queue, and a few other "working" threads pick up objects and do their job on them (not rocket science).

Please consider two examples of my working threads picking up elements. The first example works fine unless -O3 optimization is specified. In that case it starts looping endlessly without picking anything up.

    while (true){
        if (t_exitSignal[tID]){
            dorun = false;
            break;
        }

        //cout<<"w8\n";

        //check if queue has some work for us
        if (!frame_queue->empty()){

            //try to get lock and recheck that queue no empty
            pthread_mutex_lock( &mutex_frame_queue );

            if (!frame_queue->empty()){
                cout<<"Pickup "<<tID<<endl;
                con = frame_queue->front();
                frame_queue->pop();
                t_idling[tID] = false;
                pthread_mutex_unlock( &mutex_frame_queue );
                break;
            }

            pthread_mutex_unlock( &mutex_frame_queue );
        }

    }

Now consider this one: exactly the same code, except that the mutex gets locked before I check queue->empty(). This works fine at all optimization levels.

    while (true){
        if (t_exitSignal[tID]){
            dorun = false;
            break;
        }
        //cout<<"w8\n";

        //try to get lock and recheck that queue no empty
        pthread_mutex_lock( &mutex_frame_queue );

        //check if queue has some work for us
        if (!frame_queue->empty()){

                cout<<"Pickup "<<tID<<endl;
                con = frame_queue->front();
                frame_queue->pop();
                t_idling[tID] = false;
                pthread_mutex_unlock( &mutex_frame_queue );
                break;

        }
        pthread_mutex_unlock( &mutex_frame_queue );

    }

Just in case it makes a difference, this is how I populate the queue from the other thread:

    pthread_mutex_lock( &mutex_frame_queue );
    //adding the same container into queue to make it available for threads
    frame_queue->push(*cursor);
    pthread_mutex_unlock( &mutex_frame_queue );

My question is: why does the first code example stop working when I compile with the -O3 option? Any other suggestions for the queuing system?

Thanks a lot!

SOLUTION: This is what I came up with in the end. It seems to work much better than either of the methods above (just in case someone is interested ;)

    while (true){

        if (t_exitSignal[tID]){

            dorun = false;
            break;
        }
        //try to get lock and check that queue no empty
        pthread_mutex_lock( &mutex_frame_queue );

        if (!frame_queue->empty()){

            con = frame_queue->front();
            frame_queue->pop();
            t_idling[tID] = false;
            pthread_mutex_unlock( &mutex_frame_queue );
            break;
        }else{

            pthread_cond_wait(&conf_frame_queue, &mutex_frame_queue);
            pthread_mutex_unlock( &mutex_frame_queue );
        }




    }

Adding:

        pthread_mutex_lock( &mutex_frame_queue );

        //adding the same container into queue to make it available for threads
        frame_queue->push(*cursor);
        //wake up any waiting threads
        pthread_cond_signal(&conf_frame_queue);
        pthread_mutex_unlock( &mutex_frame_queue );

I'm tempted to suggest __sync_synchronize() before the first empty check, but that's probably not safe: if another thread is in the middle of adding to the container, that container may still be in an inconsistent state when you call empty(). Depending on the container, anything could happen, from getting back the wrong answer to crashing.

Josh is probably right too. Locking a mutex also provides a memory barrier, which among other things means your code will re-read the memory it uses to determine whether the container is empty each time. Without some sort of memory barrier, that is never actually guaranteed to happen, so at higher optimization levels your code may never see the change.
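The same reasoning applies to any shared value a worker polls, including the exit flag that the loops above read without the lock. A minimal sketch, assuming a mutex-guarded flag: `ExitFlag`, `exit_requested`, `request_exit` and `run_flag_demo` are hypothetical names, not taken from the question. Every read goes through the lock, so the compiler cannot legally keep a stale copy across iterations.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical stand-in for one slot of a shared "please exit" array.
struct ExitFlag {
    pthread_mutex_t mutex;
    bool            requested;
};

// Read the flag while holding the mutex that guards it.
bool exit_requested(ExitFlag& f) {
    pthread_mutex_lock(&f.mutex);
    bool r = f.requested;
    pthread_mutex_unlock(&f.mutex);
    return r;
}

// Set the flag while holding the same mutex.
void request_exit(ExitFlag& f) {
    pthread_mutex_lock(&f.mutex);
    f.requested = true;
    pthread_mutex_unlock(&f.mutex);
}

// Worker: polls until the flag is set. Each poll locks, so each poll re-reads.
void* poll_until_exit(void* p) {
    ExitFlag* f = static_cast<ExitFlag*>(p);
    while (!exit_requested(*f)) {
        // real work would go here
    }
    return NULL;
}

// Starts one worker, asks it to stop, and waits for it to finish.
bool run_flag_demo() {
    ExitFlag f;
    pthread_mutex_init(&f.mutex, NULL);
    f.requested = false;

    pthread_t t;
    int rc = pthread_create(&t, NULL, poll_until_exit, &f);
    assert(rc == 0);
    (void)rc;

    request_exit(f);
    pthread_join(t, NULL);  // returns only once the worker saw the flag

    bool seen = exit_requested(f);
    pthread_mutex_destroy(&f.mutex);
    return seen;
}
```

If the read in `poll_until_exit` were a plain unlocked `f->requested`, an optimizing compiler would be allowed to load it once and spin forever, which matches the symptom described in the question.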

Also, have you looked into pthread's condition variables? They would allow you to avoid polling in a loop until your container is no longer empty.
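A condition-variable queue could be sketched roughly as follows. This is only an illustration under stated assumptions: `WorkQueue`, `push_item`, `pop_item`, `shut_down` and `run_demo` are hypothetical names, the payload is a plain `int`, and shutdown is modelled with a `closed` flag rather than a per-thread exit array. The wait sits in a `while` loop because `pthread_cond_wait` may wake up spuriously.

```cpp
#include <pthread.h>
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical queue type; every field is guarded by `mutex`.
struct WorkQueue {
    pthread_mutex_t mutex;
    pthread_cond_t  not_empty;
    std::queue<int> items;
    bool            closed;
};

void init_queue(WorkQueue& q) {
    pthread_mutex_init(&q.mutex, NULL);
    pthread_cond_init(&q.not_empty, NULL);
    q.closed = false;
}

// Producer side: add one item and wake one waiting worker.
void push_item(WorkQueue& q, int value) {
    pthread_mutex_lock(&q.mutex);
    q.items.push(value);
    pthread_cond_signal(&q.not_empty);
    pthread_mutex_unlock(&q.mutex);
}

// Tell all workers that no more items will arrive.
void shut_down(WorkQueue& q) {
    pthread_mutex_lock(&q.mutex);
    q.closed = true;
    pthread_cond_broadcast(&q.not_empty);
    pthread_mutex_unlock(&q.mutex);
}

// Consumer side: blocks until an item is available.
// Returns false once the queue is closed and fully drained.
bool pop_item(WorkQueue& q, int& out) {
    pthread_mutex_lock(&q.mutex);
    while (q.items.empty() && !q.closed) {
        pthread_cond_wait(&q.not_empty, &q.mutex);  // releases mutex while asleep
    }
    bool got = !q.items.empty();
    if (got) {
        out = q.items.front();
        q.items.pop();
    }
    pthread_mutex_unlock(&q.mutex);
    return got;
}

struct WorkerArgs {
    WorkQueue* queue;
    long       sum;   // written only by the owning worker, read after join
};

void* worker(void* p) {
    WorkerArgs* a = static_cast<WorkerArgs*>(p);
    int v;
    while (pop_item(*a->queue, v)) {
        a->sum += v;  // stand-in for real work on the item
    }
    return NULL;
}

// Feeds 1..n to `workers` threads and returns the total they consumed.
long run_demo(int workers, int n) {
    WorkQueue q;
    init_queue(q);

    std::vector<pthread_t>  threads(workers);
    std::vector<WorkerArgs> args(workers);
    for (int i = 0; i < workers; ++i) {
        args[i].queue = &q;
        args[i].sum = 0;
        int rc = pthread_create(&threads[i], NULL, worker, &args[i]);
        assert(rc == 0);
        (void)rc;
    }

    for (int v = 1; v <= n; ++v) push_item(q, v);
    shut_down(q);

    long total = 0;
    for (int i = 0; i < workers; ++i) {
        pthread_join(threads[i], NULL);
        total += args[i].sum;
    }
    pthread_cond_destroy(&q.not_empty);
    pthread_mutex_destroy(&q.mutex);
    return total;
}
```

With this sketch, `run_demo(3, 100)` should return 5050, since every pushed item is consumed exactly once before the workers see the closed, empty queue and exit. Workers sleep inside `pthread_cond_wait` instead of spinning, so an idle queue costs no CPU time.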

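I suspect you are seeing a bug that comes from assumptions about instruction ordering when you check whether the queue is empty. When you turn on optimization, the ordering changes, and because that check is not protected by the mutex, the code breaks.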
