
Does C++ guarantee it is safe to access adjacent elements of an array from two threads?

As far as the C++ standard is concerned (C++11 and later, I guess, since before that threads were not considered), is it safe to write concurrently to different, possibly adjacent, elements of an array?

For example:

#include <iostream>
#include <thread>
#include <vector>

int array[10];

void func(int i) {
   array[i] = 42;
}

int main() 
{
   std::vector<std::thread> threads;
   for(int i = 0; i < 10; ++i) {
      // spawn func(i) on a separate thread
      // (std::thread here for concreteness; std::async would work too)
      threads.emplace_back(func, i);
   }
   // join all threads before reading the array
   for(auto& t : threads) {
      t.join();
   }

   for(int i = 0; i < 10; ++i) {
      std::cout << array[i] << std::endl; // prints 42?
   }

   return 0;
}

In this case, is it guaranteed by the language that writes to different elements of the array do not cause data races? And is this guaranteed for any element type, or are there requirements for it to be safe?

Data races can only occur on the same memory location, i.e. there can be a data race on two glvalues x and y only if &x == &y.

[intro.races]/2

Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.

[intro.races]/21

The execution of a program contains a data race if it contains two potentially concurrent conflicting actions [...]

The remainder of the definition doesn't apply here. So no, there is no data race on your array.
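To make the "conflict" wording concrete, here is a minimal sketch of my own (not part of the standard text): the first pair of threads does not conflict, while the commented-out pair would modify the same memory location and thus race.

#include <thread>

int a[2];

int main()
{
   // No conflict: each thread modifies a different memory location.
   std::thread t1([] { a[0] = 1; });
   std::thread t2([] { a[1] = 2; });
   t1.join();
   t2.join();

   // Conflict, and (absent synchronization) a data race, i.e. undefined
   // behaviour: both threads would modify the same memory location a[0].
   // std::thread u1([] { a[0] = 1; });
   // std::thread u2([] { a[0] = 2; });
}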

Yes.

From https://en.cppreference.com/w/cpp/language/memory_model :

When an evaluation of an expression writes to a memory location and another evaluation reads or modifies the same memory location, the expressions are said to conflict. A program that has two conflicting evaluations has a data race unless [...]

Then:

A memory location is

  • an object of scalar type (arithmetic type, pointer type, enumeration type, or std::nullptr_t)
  • or the largest contiguous sequence of bit fields of non-zero length

So, if the elements of an array are stored at different memory locations, you do not have conflicting evaluations. (Bit-fields are the one subtlety; see the sketch below.)
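Here is a sketch of mine (not from cppreference) illustrating the bit-field bullet: adjacent non-zero-length bit-fields share a single memory location, so concurrent writes to them would race, unlike writes to separate locations.

#include <thread>

struct S {
   unsigned a : 4;  // a and b belong to one contiguous sequence of
   unsigned b : 4;  // non-zero-length bit-fields: a single memory location
   unsigned : 0;    // a zero-length bit-field ends that sequence
   unsigned c : 4;  // c starts a new memory location
};

S s{};

int main()
{
   // OK: s.a and s.c are distinct memory locations.
   std::thread t1([] { s.a = 1; });
   std::thread t2([] { s.c = 2; });
   t1.join();
   t2.join();

   // NOT OK (data race): s.a and s.b share one memory location.
   // std::thread u1([] { s.a = 1; });
   // std::thread u2([] { s.b = 2; });
}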

And an array is:

A declaration of the form T a[N]; declares a as an array object that consists of N contiguously allocated objects of type T.

Since two distinct objects cannot have the same address, they and their constituents cannot occupy the same memory location, which guarantees that the earlier requirement is satisfied.

Moreover, objects can consist of more than one memory location, so you could even have two threads operate on different members of the same object!
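A minimal sketch of that last point (my example, not from the original answer): two threads writing different scalar members of the same object.

#include <thread>

struct Point {
   int x;  // one memory location
   int y;  // a separate memory location
};

Point p{};

int main()
{
   // No data race: each thread writes a distinct memory location,
   // even though both locations belong to the same object.
   std::thread t1([] { p.x = 1; });
   std::thread t2([] { p.y = 2; });
   t1.join();
   t2.join();
}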

Please note that for your example to be correct, the join has to be written correctly as well; but that is about synchronizing on the same data rather than about adjacent elements of an array, so I take it to be beyond the scope of the question.


Personal note: by the way, if this weren't guaranteed, it would seriously limit, if not render useless, the parallel algorithms in the standard library.
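Indeed, the C++17 parallel algorithms lean on exactly this guarantee. A sketch (assuming a toolchain with <execution> support; with libstdc++ this typically requires linking TBB):

#include <algorithm>
#include <execution>
#include <vector>

int main()
{
   std::vector<int> v(1000);
   // The implementation is free to write distinct elements from different
   // threads; that is only well-defined because distinct elements are
   // distinct memory locations.
   std::for_each(std::execution::par, v.begin(), v.end(),
                 [](int& x) { x = 42; });
}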

Yes, but the fact that it is allowed does not mean it is smart to do.

There are several issues to consider; perhaps the most important one is CPU caching. On x86, for example, cache lines are 64 bytes long, so each thread should work on a chunk of the array sized to whole cache lines to avoid false sharing.

Here is one example: false sharing SO question/answer
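To make the chunking idea concrete, here is a rough sketch of my own, assuming 64-byte cache lines (note that std::vector's buffer is not guaranteed to be 64-byte aligned, so chunk boundaries only approximately coincide with cache-line boundaries):

#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Fill `data` in parallel, one contiguous chunk per thread, with the chunk
// size rounded up to whole (assumed 64-byte) cache lines so that threads
// rarely write into the same line.
void fill_parallel(std::vector<int>& data, unsigned num_threads)
{
   constexpr std::size_t ints_per_line = 64 / sizeof(int);
   std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
   chunk = (chunk + ints_per_line - 1) / ints_per_line * ints_per_line;

   std::vector<std::thread> threads;
   for (unsigned t = 0; t < num_threads; ++t) {
      threads.emplace_back([&data, t, chunk] {
         const std::size_t begin = t * chunk;
         const std::size_t end = std::min(begin + chunk, data.size());
         for (std::size_t i = begin; i < end; ++i)
            data[i] = 42;
      });
   }
   for (auto& th : threads) th.join();
}

int main()
{
   std::vector<int> v(1000);
   fill_parallel(v, 4);
}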

It's safe to concurrently access consecutive elements on separate threads, but if it occurs frequently, it could cause performance issues in the code. This is due to the fundamental limitations of parallelism on modern CPUs.

For most programs, memory access is a major bottleneck. Performance-critical sections of the code have to be written carefully to avoid excessive cache misses. There are multiple cache levels, with the levels closer to the CPU being smaller and faster. However, when data is not in the cache, or when data might have been changed by another CPU, it has to be re-loaded into the cache.

The CPU can't keep track of the state of each individual byte, so instead it keeps track of blocks of bytes called cache lines. If any byte in a cache line is changed by another CPU, the whole line has to be re-loaded to keep the caches coherent.

Accessing separate bytes from different threads only causes this re-loading if the bytes are in the same cache line. And because cache lines are contiguous blocks of memory, accessing contiguous elements from separate threads will usually force the line to be re-loaded into each CPU's cache. This is called false sharing, and it should be avoided in parallel code when performance is a concern.
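When the data layout is under your control, one common mitigation is to pad or align each thread's data to a cache line. A sketch (using the C++17 constant std::hardware_destructive_interference_size where the library provides it, with 64 bytes as a fallback assumption):

#include <cstddef>
#include <new>     // std::hardware_destructive_interference_size (C++17)
#include <thread>

#ifdef __cpp_lib_hardware_interference_size
constexpr std::size_t line_size = std::hardware_destructive_interference_size;
#else
constexpr std::size_t line_size = 64;  // assumed cache-line size
#endif

// alignas makes each counter occupy its own cache line, so the two threads
// below no longer invalidate each other's lines (no false sharing).
struct alignas(line_size) PaddedCounter {
   long value = 0;
};

PaddedCounter counters[2];

int main()
{
   std::thread t1([] { for (int i = 0; i < 1000000; ++i) ++counters[0].value; });
   std::thread t2([] { for (int i = 0; i < 1000000; ++i) ++counters[1].value; });
   t1.join();
   t2.join();
}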

That being said, if it happens rarely it's probably fine, and you should benchmark and test your code before optimizing it.
