What optimizations does std::execution::unseq enable?

Original 2021-07-07 09:35:26 7 1 c++

In C++20, another execution policy, std::execution::unseq, was added. It means that an algorithm is executed on the calling thread, but without a guarantee that operations are performed in the order of the elements.

What is the rationale for adding this to the language, and what optimizations does it enable? As far as I can tell from looking through an implementation, this policy is treated pretty much exactly like seq, unless I missed something (which is likely). In my experience, compilers already recognize loops that they can vectorize (for trivial types), and vectorization is possible there without any special flags thanks to the "as-if" rule. So, what does std::execution::unseq change?

1 Answer

So, what does std::execution::unseq change?

The standard says (latest draft):

[execpol.unseq]

The class unsequenced_policy is an execution policy type used as a unique type to disambiguate parallel algorithm overloading and indicate that a parallel algorithm's execution may be vectorized, e.g., executed on a single thread using instructions that operate on multiple data items.

[algorithms.parallel.exec]

The invocations of element access functions in parallel algorithms invoked with an execution policy object of type execution::unsequenced_policy are permitted to execute in an unordered fashion in the calling thread of execution, unsequenced with respect to one another in the calling thread of execution.

[Note 4: This means that multiple function object invocations can be interleaved on a single thread of execution, which overrides the usual guarantee from [intro.execution] that function executions do not overlap with one another. - end note]
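One practical consequence of that interleaving, as I read it: a callback passed with unseq should not lock a mutex or otherwise synchronize, and it should not update a shared accumulator, because two invocations may overlap on the same thread. A minimal sketch of the safer pattern, where the algorithm itself carries the accumulation. The name sum_of_squares is hypothetical and used only for illustration, and the code assumes a standard library that provides std::execution::unseq (C++20):

```cpp
#include <execution>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper (not from the original answer): sums the squares of
// the input. The transform lambda touches no shared mutable state and takes
// no locks, so overlapping invocations cannot interfere with each other.
long long sum_of_squares(const std::vector<int>& values) {
    return std::transform_reduce(
        std::execution::unseq,           // invocations may be interleaved on this thread
        values.begin(), values.end(),
        0LL,                             // initial value; also fixes the result type
        std::plus<>{},                   // reduction step, applied by the algorithm
        [](int x) { return static_cast<long long>(x) * x; });  // per-element step
}
```

Calling sum_of_squares on a vector holding 1, 2, 3, 4 gives 30. By contrast, a std::for_each with unseq whose lambda does `total += x` on a captured variable, or locks a std::mutex, relies on exactly the non-overlap guarantee that Note 4 says is removed.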

compilers already recognize loops that they can vectorize

Some compilers may sometimes be smart enough to detect that vectorisation is possible and to guess that it is beneficial. But that analysis isn't simple and won't always succeed. When you tell the compiler that vectorisation is OK and desirable, no guessing is involved.
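To illustrate what "telling the compiler" looks like in code, here is a minimal sketch that passes the policy to std::transform. The function name saxpy and its signature are assumptions made for this example only. Whether SIMD instructions are actually emitted still depends on the standard library implementation and on compiler flags, so treat the policy as permission rather than a guarantee:

```cpp
#include <algorithm>
#include <execution>
#include <vector>

// Hypothetical helper (not from the original answer): computes
// a * x[i] + y[i] for every i. Assumes x and y have the same length.
std::vector<float> saxpy(float a, const std::vector<float>& x,
                         const std::vector<float>& y) {
    std::vector<float> out(x.size());
    // The policy states up front that the per-element calls may be
    // interleaved, so the implementation does not have to prove that
    // reordering them is safe before vectorising.
    std::transform(std::execution::unseq,
                   x.begin(), x.end(), y.begin(), out.begin(),
                   [a](float xi, float yi) { return a * xi + yi; });
    return out;
}
```

With a = 2, x = {1, 2, 3} and y = {10, 20, 30}, the result is {12, 24, 36}, the same as with seq. The policy changes what the implementation is allowed to do, not the values computed by a well-formed call.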