
In Visual Studio, `thread_local` variables' destructors are not called when used with std::async. Is this a bug?

The following code:

#include <chrono>
#include <iostream>
#include <future>
#include <thread>
#include <mutex>

std::mutex m;

struct Foo {
    Foo() {
        std::unique_lock<std::mutex> lock{m};
        std::cout <<"Foo Created in thread " <<std::this_thread::get_id() <<"\n";
    }

    ~Foo() {
        std::unique_lock<std::mutex> lock{m};
        std::cout <<"Foo Deleted in thread " <<std::this_thread::get_id() <<"\n";
    }

    void proveMyExistance() {
        std::unique_lock<std::mutex> lock{m};
        std::cout <<"Foo this = " << this <<"\n";
    }
};

int threadFunc() {
    static thread_local Foo some_thread_var;

    // Prove the variable initialized
    some_thread_var.proveMyExistance();

    // The thread runs for some time
    std::this_thread::sleep_for(std::chrono::milliseconds{100}); 

    return 1;
}

int main() {
    auto a1 = std::async(std::launch::async, threadFunc);
    auto a2 = std::async(std::launch::async, threadFunc);
    auto a3 = std::async(std::launch::async, threadFunc);

    a1.wait();
    a2.wait();
    a3.wait();

    std::this_thread::sleep_for(std::chrono::milliseconds{1000});        

    return 0;
}

Compiled and run with clang on macOS:

clang++ test.cpp -std=c++14 -pthread
./a.out

I got this result:

Foo Created in thread 0x70000d9f2000
Foo Created in thread 0x70000daf8000
Foo Created in thread 0x70000da75000
Foo this = 0x7fd871d00000
Foo this = 0x7fd871c02af0
Foo this = 0x7fd871e00000
Foo Deleted in thread 0x70000daf8000
Foo Deleted in thread 0x70000da75000
Foo Deleted in thread 0x70000d9f2000

Compiled and run in Visual Studio 2015 Update 3:

Foo Created in thread 7180
Foo this = 00000223B3344120
Foo Created in thread 8712
Foo this = 00000223B3346750
Foo Created in thread 11220
Foo this = 00000223B3347E60

The destructors are not called.

Is this a bug or some undefined grey zone?

P.S.

If the sleep at the end, std::this_thread::sleep_for(std::chrono::milliseconds{1000});, is not long enough, you may sometimes not see all 3 "Deleted" messages.

When using std::thread instead of std::async, the destructors get called on both platforms, and all 3 "Deleted" messages are always printed.
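The std::thread behaviour can be checked with a small sketch like the one below. The names `Tracker`, `destroyed`, `worker` and `run_with_std_thread` are made up for illustration and are not part of the original program. The sketch counts destructor calls instead of printing them.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: counts how many Tracker objects have been destroyed.
std::atomic<int> destroyed{0};

struct Tracker {
    ~Tracker() { ++destroyed; }
    void touch() {}  // called only to use the thread_local object below
};

void worker() {
    thread_local Tracker tracker;  // one instance per thread
    tracker.touch();
}

// Runs `n` threads to completion and returns the number of Tracker
// destructors that have run once every thread has been joined.
int run_with_std_thread(int n) {
    destroyed = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) threads.emplace_back(worker);
    for (auto& t : threads) t.join();
    return destroyed.load();
}
```

Calling `run_with_std_thread(3)` should report 3, because each thread destroys its own `Tracker` when it exits, and `join()` does not return before that.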

Introductory note: I have now learned a lot more about this and have therefore rewritten my answer. Thanks to @super, @MM and (latterly) @DavidHaim and @NoSenseEtAl for putting me on the right track.

tl;dr: Microsoft's implementation of std::async is non-conformant, but they have their reasons, and what they have done can actually be useful once you understand it properly.

For those who don't want that, it is not too difficult to code up a drop-in replacement for std::async which works the same way on all platforms. I have posted one here.
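As a rough illustration of the idea (this is not the replacement posted by the answerer), a minimal sketch could start a fresh thread for every task and use `std::packaged_task::make_ready_at_thread_exit`, so that the future only becomes ready after the thread's `thread_local` objects have been destroyed. The name `async_in_new_thread` is hypothetical, and the sketch only accepts callables that take no arguments.

```cpp
#include <future>
#include <thread>
#include <utility>

// Hypothetical helper: always runs `f` on a brand-new thread.
// Restricted to callables taking no arguments; wrap anything else in a lambda.
template <typename F>
auto async_in_new_thread(F f) -> std::future<decltype(f())> {
    using R = decltype(f());
    std::packaged_task<R()> task(std::move(f));
    std::future<R> result = task.get_future();

    // make_ready_at_thread_exit() runs the task now, but the shared state only
    // becomes ready after the thread's thread_local objects are destroyed.
    std::thread runner(
        [](std::packaged_task<R()> t) { t.make_ready_at_thread_exit(); },
        std::move(task));
    runner.detach();
    return result;
}
```

One difference from std::async to keep in mind: the destructor of the returned future does not block, so the caller has to call `get()` or `wait()` before relying on the task having finished.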

Edit: Wow, how open MS are being these days. I like it; see: https://github.com/MicrosoftDocs/cpp-docs/issues/308


Let's begin at the beginning. cppreference has this to say (emphasis and strikethrough mine):

The template function async runs the function f asynchronously (potentially optionally in a separate thread which may be part of a thread pool).

However, the C++ standard says this:

If launch::async is set in policy, [std::async] calls [the function f] as if in a new thread of execution ...

So which is correct? As the OP has discovered, the two statements have very different semantics. The standard is of course correct, as both clang and gcc show, so why does the Windows implementation differ? Like so many things, it comes down to history.

The (oldish) link that MM dredged up has this to say, amongst other things:

... Microsoft has its implementation of [std::async] in the form of PPL (Parallel Pattern Library) ... [and] I can understand the eagerness of those companies to bend the rules and make these libraries accessible through std::async, especially if they can dramatically improve performance...

... Microsoft wanted to change the semantics of std::async when called with launch_policy::async. I think this was pretty much ruled out in the ensuing discussion ... (rationale follows; if you want to know more, read the link, it's well worth it).

And PPL is based on Windows' built-in support for ThreadPools, so @super was right.

So what does the Windows thread pool do and what is it good for? Well, it's intended to manage frequently scheduled, short-running tasks efficiently, so point 1 is: don't abuse it. But my simple tests show that if this is your use case, it can offer significant efficiencies. It does, essentially, two things:

  • It recycles threads, rather than always having to start a new one for each asynchronous task you launch.
  • It limits the total number of background threads it uses, after which a call to std::async will block until a thread becomes free. On my machine, this number is 768.

So, knowing all that, we can now explain the OP's observations:

  1. A new thread is created for each of the three tasks started by main() (because none of them terminates immediately).

  2. Each of these three threads creates a new thread-local variable Foo some_thread_var.

  3. These three tasks all run to completion, but the threads they are running on remain in existence (sleeping).

  4. The program then sleeps for a short while and then exits, leaving the 3 thread-local variables undestroyed.

I ran a number of tests and, in addition to this, found a few key things:

  • When a thread is recycled, the thread-local variables are re-used. Specifically, they are not destroyed and then re-created (you have been warned!).
  • If all the asynchronous tasks complete and you wait long enough, the thread pool terminates all the associated threads and the thread-local variables are then destroyed. (No doubt the actual rules are more complex than that, but that's what I observed.)
  • As new asynchronous tasks are submitted, the thread pool limits the rate at which new threads are created, in the hope that one will become free before it needs to perform all that work (creating new threads is expensive). A call to std::async might therefore take a while to return (up to 300 ms in my tests). In the meantime, it's just hanging around, hoping that its ship will come in. This behaviour is documented, but I call it out here in case it takes you by surprise.
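The re-use of thread-local variables described in the first point can be probed with a sketch along these lines. The function names are hypothetical, and the sketch runs the tasks one after another so that a pooling implementation has every opportunity to recycle a thread.

```cpp
#include <future>
#include <vector>

// Hypothetical probe: returns how many tasks have run on the calling thread,
// including the current one. The counter is never reset between tasks.
int tasks_seen_by_this_thread() {
    thread_local int count = 0;
    return ++count;
}

// Runs `n` tasks sequentially with std::async and records, for each task,
// the value of the per-thread counter that the task observed.
std::vector<int> probe_thread_reuse(int n) {
    std::vector<int> seen;
    for (int i = 0; i < n; ++i) {
        auto f = std::async(std::launch::async, tasks_seen_by_this_thread);
        seen.push_back(f.get());
    }
    return seen;
}
```

If every task really gets a new thread, each entry is 1. Values greater than 1 would indicate that a thread, together with its thread-local state, was recycled, which is what the observations above suggest for the Windows implementation.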

Conclusions:

  1. Microsoft's implementation of std::async is non-conformant, but it is clearly designed with a specific purpose, and that purpose is to make good use of the Win32 ThreadPool API. You can beat them up for blatantly flouting the standard, but it's been this way for a long time and they probably have (important!) customers who rely on it. I will ask them to call this out in their documentation. Not doing that is criminal.

  2. It is not safe to use thread_local variables in std::async tasks on Windows. Just don't do it; it will end in tears.
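One way to follow that advice is to keep per-task state in an ordinary local object instead of a thread_local one, so its lifetime is tied to the task rather than to whichever thread happens to run it. The sketch below assumes that the state does not need to outlive a single task. `Scratch`, `task_with_local_state` and `run_tasks` are illustrative names only.

```cpp
#include <future>
#include <vector>

// Hypothetical per-task state.
struct Scratch {
    int uses = 0;
};

int task_with_local_state() {
    Scratch scratch;  // constructed for every task, destroyed when it returns
    ++scratch.uses;
    return scratch.uses;
}

// Runs `n` tasks sequentially and collects what each task saw in its state.
std::vector<int> run_tasks(int n) {
    std::vector<int> results;
    for (int i = 0; i < n; ++i) {
        auto f = std::async(std::launch::async, task_with_local_state);
        results.push_back(f.get());
    }
    return results;
}
```

Every task sees `uses == 1` here, whether or not the implementation recycles threads.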

Looks like just another of the many bugs in VC++. Consider this quote from n4750:

All variables declared with the thread_local keyword have thread storage duration. The storage for these entities shall last for the duration of the thread in which they are created. There is a distinct object or reference per thread, and use of the declared name refers to the entity associated with the current thread. 2 A variable with thread storage duration shall be initialized before its first odr-use (6.2) and, if constructed, shall be destroyed on thread exit.

plus this:

If the implementation chooses the launch::async policy, (5.3) a call to a waiting function on an asynchronous return object that shares the shared state created by this async call shall block until the associated thread has completed, as if joined, or else time out (33.3.2.5);

I could be wrong ("thread exit" vs "thread completed"), but I feel this means that thread_local variables need to be destroyed before the .wait() call unblocks.
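Whether a given implementation behaves according to that reading can be probed with a sketch like the following, which reports how many thread_local destructors had already run when the last wait() returned. The names `Probe`, `destroyed_count`, `probe_task` and `destroyed_when_wait_returns` are made up for this example.

```cpp
#include <atomic>
#include <future>
#include <vector>

// Hypothetical counter of destroyed Probe objects.
std::atomic<int> destroyed_count{0};

struct Probe {
    ~Probe() { ++destroyed_count; }
    void touch() {}  // called only to use the thread_local object below
};

void probe_task() {
    thread_local Probe probe;
    probe.touch();
}

// Launches `n` tasks with std::launch::async, waits for all of them, and
// returns how many Probe destructors had run by the time wait() returned.
int destroyed_when_wait_returns(int n) {
    destroyed_count = 0;
    std::vector<std::future<void>> futures;
    for (int i = 0; i < n; ++i)
        futures.push_back(std::async(std::launch::async, probe_task));
    for (auto& f : futures) f.wait();
    return destroyed_count.load();
}
```

Under the interpretation above, the function would return `n`. A smaller value means the implementation lets wait() return before the thread-local destructors have run.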
