
What optimizations does std::execution::unseq enable?

In C++20, another execution policy was added: std::execution::unseq. It means that an algorithm will be executed on the calling thread, but without a guarantee that the operations will be done in the order of the elements.

What is the rationale for adding it to the standard, and what optimizations does it enable? As far as I can tell from looking through an implementation, this policy is treated pretty much exactly like seq, unless I missed something (which is likely). In my experience, compilers already recognize loops that they can vectorize (for trivial types), and vectorization is possible there without any special flags thanks to the "as-if" rule. So, what does std::execution::unseq change?

So, what does std::execution::unseq change?

The standard says (latest draft):

[execpol.unseq]

The class unsequenced_policy is an execution policy type used as a unique type to disambiguate parallel algorithm overloading and indicate that a parallel algorithm's execution may be vectorized, e.g., executed on a single thread using instructions that operate on multiple data items.


[algorithms.parallel.exec]

The invocations of element access functions in parallel algorithms invoked with an execution policy object of type execution::unsequenced_policy are permitted to execute in an unordered fashion in the calling thread of execution, unsequenced with respect to one another in the calling thread of execution.

[Note 4: This means that multiple function object invocations can be interleaved on a single thread of execution, which overrides the usual guarantee from [intro.execution] that function executions do not overlap with one another. — end note]
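To make the note concrete, here is a minimal sketch of a function object that stays correct when its invocations are interleaved. The helper name clamp_all is hypothetical and not part of the original answer. The sketch assumes a C++20 standard library that provides std::execution::unseq, and falls back to the plain overload otherwise. Depending on the toolchain, using <execution> may also require linking a backend library such as TBB.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <vector>

// Hypothetical helper: clamps every element of `values` into [lo, hi].
// The lambda reads and writes only its own element, takes no locks and
// does not allocate, so interleaving several invocations on the calling
// thread cannot change the result.
void clamp_all(std::vector<int>& values, int lo, int hi) {
    assert(lo <= hi);
    auto clamp_one = [lo, hi](int& v) { v = std::clamp(v, lo, hi); };
#if defined(__cpp_lib_execution) && __cpp_lib_execution >= 201902L
    // C++20: invocations of clamp_one may be unsequenced with respect to one another.
    std::for_each(std::execution::unseq, values.begin(), values.end(), clamp_one);
#else
    // Assumption: no unseq support available, so use the ordinary sequential overload.
    std::for_each(values.begin(), values.end(), clamp_one);
#endif
}

// By contrast, a function object like the following must NOT be used with unseq:
//
//   std::mutex m;
//   std::for_each(std::execution::unseq, first, last,
//                 [&](int v) { std::lock_guard<std::mutex> guard(m); /* ... */ });
//
// Interleaved invocations could attempt to lock m twice on the same thread,
// which may deadlock.
```

Calling clamp_all(v, 0, 10) on a vector holding -5, 0, 7, 42 leaves it holding 0, 0, 7, 10 whichever branch is compiled. The policy widens what the implementation is allowed to do, not what the caller observes for a well-behaved function object.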


compilers already recognize loops that they can vectorize

Some compilers may sometimes be smart enough to detect that vectorisation is possible, and to guess that it is beneficial. But that analysis isn't simple and won't always succeed. By telling the implementation that vectorisation is OK and desirable, no guessing is involved.
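As an illustration of how that hint is passed, here is a small sketch of a SAXPY-style update written with std::transform. The function name saxpy is an assumption made for this example, and the feature-test guard is only there so the code still builds where unseq is unavailable. Whether the generated code differs from a plain loop depends on the implementation. As the question notes, an implementation may treat unseq much like seq.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <vector>

// Hypothetical helper: computes y[i] = a * x[i] + y[i] for every i.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    // Pure arithmetic on the two arguments: no shared state, no synchronization.
    auto op = [a](float xi, float yi) { return a * xi + yi; };
#if defined(__cpp_lib_execution) && __cpp_lib_execution >= 201902L
    // The policy states up front that element operations may be interleaved,
    // so the implementation does not have to prove that on its own.
    std::transform(std::execution::unseq,
                   x.begin(), x.end(), y.begin(), y.begin(), op);
#else
    // Assumption: pre-C++20 library, so rely on the compiler's auto-vectorizer.
    std::transform(x.begin(), x.end(), y.begin(), y.begin(), op);
#endif
}
```

For example, with x holding 1, 2, 3 and y holding 10, 20, 30, calling saxpy(2.0f, x, y) leaves y holding 12, 24, 36.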
