Atomic operations seem slower than semaphore operations when sharing between multiple threads

Question

Can you think of any good reason why atomic operations seem slower than semaphores, even though fewer instructions are executed?

Sample code:

void increment(){
    if (strcmp(type, "ATOMIC") == 0) {
        for (int i = 0; i < RUN_TIME; ++i) {
            atomic_fetch_add_explicit(&count, 1, memory_order_relaxed);
        }
    }
    if (strcmp(type, "SEMAPHORE") == 0){
        for (int i = 0; i < RUN_TIME; ++i) {
            sem_wait(sem);
            count++;
            sem_post(sem);
        }
    }
}

Output:

time ./CMAIN "SEMAPHORE";time ./CMAIN "ATOMIC";
[C] SEMAPHORE, count 4000000

real    0m0.039s
user    0m0.029s
sys     0m0.002s
[C] ATOMIC, count 4000000

real    0m0.092s
user    0m0.236s
sys     0m0.003s

Answer 1

It shouldn't be, because what I have read is: "When a process tries to acquire a semaphore that is not available, the semaphore puts the process on a wait queue (FIFO) and puts the task to sleep." That costs more time and CPU overhead than an atomic operation.

Normally an atomic operation is faster because it performs the load, the modification, and the store together as one indivisible step. But atomic operations are CPU-specific: there is no guarantee that n++ will be executed as a single instruction (such as INC). It is up to the CPU, and that may be why you are getting output like this.

This is what I understood; suggestions are appreciated.
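To make the "single instruction or not" point concrete, here is a minimal sketch of how an atomic add can be expressed as a compare-and-swap retry loop, which is roughly what a target without a dedicated atomic-add instruction has to do. The helper names `fetch_add_via_cas` and `demo_increment` are made up for illustration and are not part of the question's code; what a given compiler actually emits depends on the CPU and the flags.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical helper: an atomic add built from a compare-and-swap
   retry loop instead of a single hardware instruction. */
int fetch_add_via_cas(_Atomic int *obj, int delta)
{
    int old = atomic_load_explicit(obj, memory_order_relaxed);
    /* If another thread changed *obj in the meantime, the exchange
       fails, `old` is refreshed with the current value, and we retry. */
    while (!atomic_compare_exchange_weak_explicit(
               obj, &old, old + delta,
               memory_order_relaxed, memory_order_relaxed)) {
        /* retry with the updated `old` */
    }
    return old; /* value before the addition, like atomic_fetch_add */
}

/* Hypothetical helper: increments two counters n times, one with the
   library call and one with the CAS loop, and returns the final value. */
int demo_increment(int n)
{
    _Atomic int a = 0;
    _Atomic int b = 0;
    for (int i = 0; i < n; ++i) {
        atomic_fetch_add_explicit(&a, 1, memory_order_relaxed);
        fetch_add_via_cas(&b, 1);
    }
    assert(atomic_load(&a) == atomic_load(&b));
    return atomic_load(&a);
}
```

Both versions give the same result; the difference is only in how many instructions (and, under contention, how many retries) each increment may take.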

Answer 2

Can't reproduce. For 10^9 iterations, I'm getting (from bash, i5, x86_64, Linux):

$ TIMEFORMAT="%RR %UU %SS"
$ gcc atomic.c -Os -lpthread && ( time ./a.out ATOMIC  ; time ./a.out  SEMAPHORE )
1.572R  1.568U  0.000S  #ATOMIC
5.542R  5.536U  0.000S  #SEMAPHORE

(About the same ratio for 4000000 iterations.)

My atomic.c (your example with the blanks filled in):

#include <stdio.h>
#include <string.h>
#include <stdatomic.h>
#include <semaphore.h>
#define RUN_TIME 100000000
char * type;
sem_t *sem;

_Atomic int count = ATOMIC_VAR_INIT(0);

void increment(){
    if (strcmp(type, "ATOMIC") == 0) {
        for (int i = 0; i < RUN_TIME; ++i) {
            atomic_fetch_add_explicit(&count, 1, memory_order_relaxed);
        }
    }
    if (strcmp(type, "SEMAPHORE") == 0){
        for (int i = 0; i < RUN_TIME; ++i) {
            sem_wait(sem);
            count++;
            sem_post(sem);
        }
    }
}

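int main(int C, char**V)
{
    /* Without an argument V[1] is NULL and strcmp() would crash. */
    if (C < 2) {
        fprintf(stderr, "usage: %s ATOMIC|SEMAPHORE\n", V[0]);
        return 1;
    }
    sem_t s;
    sem_init(&s, 0, 1);   /* unnamed, process-private semaphore, initial value 1 */
    sem = &s;
    type = V[1];
    increment();          /* note: called from a single thread only */
    return 0;
}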

Please post an MCVE, along with your platform specs.
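Since the program above runs `increment()` on one thread only, a multi-threaded variant may be closer to what the question measured. The following is a rough sketch under assumptions the question does not confirm: Linux with POSIX threads and an unnamed semaphore, and names (`run_counter`, `MODE_ATOMIC`, `MODE_SEMAPHORE`, `struct job`) that are invented here. The thread count and iterations per thread are parameters because the question does not state them.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stddef.h>

enum mode { MODE_ATOMIC, MODE_SEMAPHORE };   /* hypothetical names */

struct job {
    enum mode mode;
    int iters;
};

static _Atomic int atomic_count;  /* incremented in MODE_ATOMIC */
static int plain_count;           /* guarded by sem in MODE_SEMAPHORE */
static sem_t sem;

/* Each thread performs job->iters increments using the chosen method. */
static void *worker(void *arg)
{
    const struct job *job = arg;
    for (int i = 0; i < job->iters; ++i) {
        if (job->mode == MODE_ATOMIC) {
            atomic_fetch_add_explicit(&atomic_count, 1, memory_order_relaxed);
        } else {
            sem_wait(&sem);
            plain_count++;
            sem_post(&sem);
        }
    }
    return NULL;
}

/* Starts nthreads workers, waits for them, and returns the final count. */
int run_counter(enum mode mode, int nthreads, int iters)
{
    pthread_t tids[64];
    struct job job = { mode, iters };
    int rc;

    assert(nthreads > 0 && nthreads <= 64);
    atomic_store(&atomic_count, 0);
    plain_count = 0;

    rc = sem_init(&sem, 0, 1);   /* binary semaphore shared by the threads */
    assert(rc == 0);
    (void)rc;

    for (int t = 0; t < nthreads; ++t) {
        rc = pthread_create(&tids[t], NULL, worker, &job);
        assert(rc == 0);
    }
    for (int t = 0; t < nthreads; ++t) {
        pthread_join(tids[t], NULL);
    }
    sem_destroy(&sem);

    return mode == MODE_ATOMIC ? atomic_load(&atomic_count) : plain_count;
}
```

Built with something like `gcc -O2 -pthread`, a call such as `run_counter(MODE_ATOMIC, 4, 1000000)` returns 4000000, the count shown in the question; the split into 4 threads of 1000000 iterations is only a guess. Wrapping each call in `time` or `clock_gettime` would then compare the two methods under real contention.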

Answer 1
0 2017-11-10 15:14:03

Answer 2
0 2017-11-10 17:01:03
