
Is writing to an array of floats from 2 different threads at two different indexes safe?

float myFloats[2];

// thread A:
myFloats[0] = 1.0f;

// thread B:
myFloats[1] = 1.0f;

Assuming that thread A will always access index 0 and thread B index 1, is this safe, or can the array get corrupted?

The C11 n1570 draft standard appears to assert that this is safe, but I assert that it is unwise.


I base my argument on the fact that the elements of an array cannot overlap in memory, and on the following clauses of the C11 draft standard.

5.1.2.4 Multi-threaded executions and data races

4. Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.

25. The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.

27. NOTE 13 Compiler transformations that introduce assignments to a potentially shared memory location that would not be modified by the abstract machine are generally precluded by this standard, since such an assignment might overwrite another assignment by a different thread in cases in which an abstract machine execution would not have encountered a data race. This includes implementations of data member assignment that overwrite adjacent members in separate memory locations. We also generally preclude reordering of atomic loads in cases in which the atomics in question may alias, since this may violate the "visible sequence" rules.

28. NOTE 14 Transformations that introduce a speculative read of a potentially shared memory location may not preserve the semantics of the program as defined in this standard, since they potentially introduce a data race. However, they are typically valid in the context of an optimizing compiler that targets a specific machine with well-defined semantics for data races. They would be invalid for a hypothetical machine that is not tolerant of races or provides hardware race detection.

We learn here that it is UB for two threads to perform conflicting actions on the same memory location, but that the compiler is "generally precluded" from introducing assignments to "potentially shared" memory locations that the abstract machine would not have performed.

You assert that your threads only access (read and write) the elements at their own specific index. Since they never access the same memory location, it appears to me that what you are doing is safe, provided you meet all other constraints, such as proper alignment of your float variables.


However, I question the wisdom of doing as you propose. Since these two memory locations are contiguous, you will likely experience a severe false sharing problem. This is because CPUs generally cache memory in "lines" of around 32 or 64 contiguous bytes, and communicate cache status using the MESI protocol.

If a thread running on one core performs a write anywhere within this cacheline, then all copies of the cacheline, and everything contained in it, found in other cores are invalidated, usually leading to threads on those other cores needing to reread their updated copies from main memory. This is several times slower than access from cache.

True sharing occurs if the threads concerned were all accessing the same part of the cacheline; in that case the invalidation is justified, because it prevents the communicating threads from using stale data.

On the other hand, false sharing occurs if the threads were all accessing different parts of the same cacheline. In this case the invalidation was not necessary, but the hardware did it anyway because of the proximity of the accesses to each other, penalizing all of them.

