c++ openmp with shared_pointer

Here is a minimal example of what bothers me:

#include <iostream>
#include <memory>
#include <omp.h>

class A{
    public:
        A(){std::cout<<this<<std::endl;}
};

int main(){
#pragma omp parallel for 
    for(unsigned int i=0;i<4;i++){
        std::shared_ptr<A> sim(std::make_shared<A>());
    }
    for(unsigned int i=0;i<4;i++){
        std::shared_ptr<A> sim(std::make_shared<A>());
    }
}

If I run that code a few times, I may get this kind of result:

0xea3308
0xea32d8
0xea3338
0x7f39f80008c8
0xea3338
0xea3338
0xea3338
0xea3338

What I realized is that the last four outputs always have the same number of characters (8). But for some reason it sometimes happens that one or more of the first four outputs contain more characters (14). It looks like using OpenMP changes the "nature" of the pointer (this is my naive understanding). Is this behaviour normal? Should I expect some strange behaviour?

EDIT

Here is a live test that shows the same problem in a slightly more complicated version of the code.

This behaviour is entirely reasonable. Let's see what's happening.

Serial loop

In every iteration, one A is created on the heap and then destroyed. These operations are ordered like so:

  1. construction
  2. destruction
  3. construction
  4. destruction
  5. ... (and so on)

Since the As are created on the heap, they go through the memory allocator. When the allocator gets a request for new memory, as in step 3, it will (in many cases) first look at recently freed memory. It sees that the last operation was a free of exactly the right size (step 2), and therefore hands out that chunk of memory again. This repeats in every iteration, so the serial loop will (commonly but not necessarily) give you the same address over and over again.
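To watch the recycling described above, here is a minimal sketch. The helper name `serial_addresses` is made up for illustration and is not part of the original code. It creates and destroys the objects strictly one after another and records each address as an integer while the object is still alive. Whether the recorded values turn out identical is up to the allocator, so the code makes no promise about that.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

struct A {};

// Hypothetical helper: construct and destroy n objects one at a time,
// recording the address of each one while it is still alive.
std::vector<std::uintptr_t> serial_addresses(unsigned n) {
    std::vector<std::uintptr_t> addresses;
    addresses.reserve(n);  // do the bookkeeping allocation up front
    for (unsigned i = 0; i < n; ++i) {
        auto sim = std::make_shared<A>();  // construction
        addresses.push_back(reinterpret_cast<std::uintptr_t>(sim.get()));
    }  // sim goes out of scope at the end of each iteration: destruction
    return addresses;
}
```

Printing the returned values for `serial_addresses(4)` will commonly show the same number four times, matching the last four lines of the output in the question, but that is an allocator habit rather than a guarantee.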

Parallel loop

Now let's think about the parallel loop. Since there is no synchronization, the ordering of the memory allocations and deallocations is not determined, so they can be interleaved in whatever way you can imagine. The memory allocator will therefore in general not be able to use the same trick as before and always hand out the same piece of memory. One possible ordering, for example, is that all four As get constructed before any of them gets destroyed, something like this:

  1. construction
  2. construction
  3. construction
  4. construction
  5. destruction
  6. destruction
  7. destruction
  8. destruction

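The memory allocator will therefore have to serve up 4 brand new pieces of memory before it can get some back and start recycling. On top of that, some allocators (glibc's malloc, for example) give threads their own memory arenas, so a block allocated on a worker thread may come from a completely different region of the address space. That is the likely reason one of your addresses is so much longer than the others.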
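The "all constructions first" ordering can be reproduced without any threads, which makes it easier to reason about. The following is only a sketch: `overlapping_addresses` and `all_distinct` are hypothetical helpers, and the serial loop stands in for one particular interleaving that the parallel loop might produce. Because all four objects are alive at the same time, their addresses have to differ.

```cpp
#include <cstdint>
#include <memory>
#include <set>
#include <vector>

struct A {};

// Hypothetical helper: construct n objects and keep them all alive,
// mimicking the ordering "n constructions, then n destructions".
std::vector<std::uintptr_t> overlapping_addresses(unsigned n) {
    std::vector<std::shared_ptr<A>> alive;
    std::vector<std::uintptr_t> addresses;
    for (unsigned i = 0; i < n; ++i) {
        alive.push_back(std::make_shared<A>());  // construction, no destruction yet
        addresses.push_back(reinterpret_cast<std::uintptr_t>(alive.back().get()));
    }
    return addresses;  // `alive` is destroyed on return: all n destructions happen here
}

// Hypothetical helper: true if no address occurs twice.
bool all_distinct(const std::vector<std::uintptr_t>& addresses) {
    return std::set<std::uintptr_t>(addresses.begin(), addresses.end()).size()
           == addresses.size();
}
```

Here `all_distinct(overlapping_addresses(4))` holds by construction, since simultaneously live objects cannot share an address. The real parallel loop may land anywhere between this ordering and the fully serial one.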

The behaviour of the stack-based version is slightly more deterministic, but can depend on compiler optimizations. For the serial version, the stack pointer is adjusted every time the object is created or destroyed. Since nothing happens in between, the object keeps getting created in the same location.

For the parallel version, every thread has its own stack in a shared memory system. Therefore each thread will create its objects in a different memory location, and no "recycling" is possible.
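The per-thread stacks can be made visible with a small sketch. It uses `std::thread` instead of OpenMP purely to stay self-contained, and `stack_addresses_per_thread` and `distinct_count` are made-up names. Each thread records the address of one of its own local variables and then waits until every other thread has done the same, so all the locals are alive at once.

```cpp
#include <atomic>
#include <cstdint>
#include <set>
#include <thread>
#include <vector>

// Hypothetical helper: every thread reports where one of its locals lives.
std::vector<std::uintptr_t> stack_addresses_per_thread(unsigned n) {
    std::vector<std::uintptr_t> addresses(n);
    std::atomic<unsigned> arrived{0};
    std::vector<std::thread> threads;
    threads.reserve(n);
    for (unsigned t = 0; t < n; ++t) {
        threads.emplace_back([&addresses, &arrived, n, t] {
            int local = 0;  // lives on this thread's own stack
            addresses[t] = reinterpret_cast<std::uintptr_t>(&local);
            arrived.fetch_add(1);
            // Keep `local` alive until all threads have recorded their address.
            while (arrived.load() < n) std::this_thread::yield();
        });
    }
    for (auto& th : threads) th.join();
    return addresses;
}

// Hypothetical helper: number of different addresses seen.
std::size_t distinct_count(const std::vector<std::uintptr_t>& addresses) {
    return std::set<std::uintptr_t>(addresses.begin(), addresses.end()).size();
}
```

With four threads, `distinct_count(stack_addresses_per_thread(4))` is 4, because the four locals exist at the same time on four different stacks. On some toolchains, building this may require a threading flag such as `-pthread`.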

The behaviour you're seeing is in no way strange, nor, for that matter, guaranteed. It depends on the number of physical cores you have, how many threads get run, how many iterations you use: generally, runtime conditions.

Bottom line: everything is fine, and you shouldn't read too much into it.

I think this depends on your environment. It is not relevant and should not be considered strange behaviour. Using MS VS 2015 Preview, your code gives me the following (with OMP enabled):

0082C3DC
0082C41C
0082C49C
0082C45C
0082C41C
0082C41C
0082C41C
0082C41C
