
Why isn't gcc optimizing the global variable?

I am trying to understand how volatile interacts with compiler optimization in C by working through an example.

For this, I referred to:

Where to use volatile?

Why is volatile needed in C?

https://software.intel.com/en-us/blogs/2007/11/30/volatile-almost-useless-for-multi-threaded-programming

All of the above posts have at least one answer involving a signal handler, so I wrote a simple program to implement one and observe the behavior on Linux.

#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <pthread.h>

int counter = 0;

void *thread0_func(void *arg)
{
    printf("Thread 0\n");
    while(1)
    {

    }
    return NULL;
}

void *thread1_func(void *arg)
{
    printf("Thread 1\n");
    while(counter == 0)
    {
        printf("Counter: %d\n", counter);
        usleep(90000);
    }
    return NULL;
}

void action_handler(int sig_no)
{
    printf("SigINT Generated: %d\n",counter);
    counter += 1;
}

int main(int argc, char **argv)
{
    pthread_t thread_id[2];

    struct sigaction sa = {0};  /* zero sa_mask and sa_flags before use */

    sa.sa_handler = action_handler;

    if(sigaction(SIGINT, &sa, NULL))
        perror("Cannot Install Sig handler");


    if(pthread_create(&thread_id[0], NULL, thread0_func, NULL))
    {
        perror("Error Creating Thread 0");
    }
    if(pthread_create(&thread_id[1], NULL, thread1_func, NULL))
    {
        perror("Error Creating Thread 1");
    }
    else
    {

    }
    while(1)
    {
        if(counter >= 5)
        {
            printf("Value of Counter is more than five\n");
        }
        usleep(90000);
    }
    return (0);
}

This code is just for learning and understanding.

I compiled the code with:
gcc -O3 main.c -o main -pthread

But the compiler does not optimize away the reads of the global variable counter.
I expected thread1_func to loop forever and the if (counter >= 5) test never to be true.

What am I missing here?

GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)

Your tests of counter are interspersed with calls to usleep and printf. These are opaque library calls: the compiler cannot see through them, so it has to assume they may access the external variable counter, and it must therefore reload counter after each call.

If you move these calls out of the loop in thread1_func, the loop gets optimized as you expect:

#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <pthread.h>

int counter = 0;

void *thread0_func(void *arg)
{
    printf("Thread 0\n");
    while(1)
    {

    }
    return NULL;
}

void *thread1_func(void *arg)
{
    printf("Thread 1\n");
    unsigned i=0;
    while(counter == 0)
    {
       i++;
    }
    printf("Thread 1: %d, i=%u\n", counter, i);
    return NULL;
}

void action_handler(int sig_no)
{
    printf("SigINT Generated: %d\n",counter);
    counter += 1;
}

int main(int argc, char **argv)
{
    pthread_t thread_id[2];

    struct sigaction sa = {0};  /* zero sa_mask and sa_flags before use */

    sa.sa_handler = action_handler;

    if(sigaction(SIGINT, &sa, NULL))
        perror("Cannot Install Sig handler");


    if(pthread_create(&thread_id[0], NULL, thread0_func, NULL))
    {
        perror("Error Creating Thread 0");
    }
    if(pthread_create(&thread_id[1], NULL, thread1_func, NULL))
    {
        perror("Error Creating Thread 1");
    }
    else
    {

    }
    while(1)
    {
        if(counter >= 5)
        {
            printf("Value of Counter is more than five\n");
        }
        usleep(90000);
    }
    return (0);
}

Even if you make the counter variable static, the compiler will still not optimize the original loop. An external library definitely cannot see a static counter, but the external call might in theory contain a mutex lock, which would allow another thread to change the variable without a data race. Neither usleep nor printf is a wrapper around a mutex lock, but the compiler does not know that, and it does not do inter-thread optimization. It therefore has to be conservative and reload counter after the call, and that reload is what prevents the optimization you expect.

Of course, a simpler explanation is that your program has undefined behavior as soon as the signal handler executes. You should have declared counter as volatile sig_atomic_t, and you should have synchronized the inter-thread access to it with either _Atomic or a mutex. In a program with undefined behavior, anything is possible.
