
Performance cost to multiple OpenMP threads reading (not writing) a shared variable?

In OpenMP (I am using C++), is there a performance cost if you have a shared (or even global) variable that is being repeatedly read (not written) by multiple threads? I am aware that if they were writing to the variable, this would be incorrect. I am asking specifically about reading only - is there a potential performance cost if multiple threads are repeatedly reading the same variable?

If the variable (more precisely, the memory location) is only read by all threads, you are basically fine in terms of both correctness and performance. Cache coherence protocols have a "shared" state, so the value can be cached on multiple cores at the same time.
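As a minimal sketch of this read-only pattern (the function name `scale_all` and its parameters are made up for illustration, not taken from the question), every thread reads the same `factor` while each iteration writes only its own output element:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: multiplies every element by `factor`.
// `factor` and `input` are shared between threads but only ever read.
std::vector<double> scale_all(const std::vector<double>& input, const double factor) {
    std::vector<double> output(input.size());
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(input.size());

    // Each iteration writes a distinct output[i], so no two threads
    // write to the same element.
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        output[i] = input[i] * factor;
    }
    return output;
}
```

With GCC or Clang this is built with `-fopenmp`; without that flag the pragma is ignored and the loop simply runs serially.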

However, you should also avoid writing data on the same cache line as the variable, because each such write invalidates that line in the caches of the other cores. Also, on a NUMA system you have to consider that some memory regions may be more expensive to read for certain cores/threads.
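One way to keep frequently written data off a shared cache line is to pad it out to the line size. The sketch below assumes 64-byte cache lines (common on x86-64, but not universal) and C++17, so that `std::vector` honors the `alignas` request. `PaddedCounter`, `kCacheLine` and `count_above` are hypothetical names used only for this example:

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Assumption: 64-byte cache lines.
constexpr std::size_t kCacheLine = 64;

// Each counter occupies a full cache line, so a write by one thread
// does not invalidate the line that holds another thread's counter.
struct alignas(kCacheLine) PaddedCounter {
    long value = 0;
};

// Hypothetical helper: counts elements greater than `threshold`.
// `data` and `threshold` are only read; each thread writes its own counter.
long count_above(const std::vector<int>& data, const int threshold) {
    int max_threads = 1;
#ifdef _OPENMP
    max_threads = omp_get_max_threads();
#endif
    std::vector<PaddedCounter> counters(static_cast<std::size_t>(max_threads));
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(data.size());

    #pragma omp parallel
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        #pragma omp for
        for (std::ptrdiff_t i = 0; i < n; ++i) {
            if (data[i] > threshold) {
                ++counters[static_cast<std::size_t>(tid)].value;
            }
        }
    }

    // Combine the per-thread results serially.
    long total = 0;
    for (const PaddedCounter& c : counters) {
        total += c.value;
    }
    return total;
}
```

In real code a `reduction(+:total)` clause would usually be simpler; the manual counters are here only to show the padding.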

If you're only reading, then you have no safety issues. Everything will work fine. By definition, you don't have race conditions, since a data race requires at least one thread to write. You don't need to do any locking, so no high-contention problems can happen. You can test thread safety at run time using Clang's ThreadSanitizer (enabled with `-fsanitize=thread`).

On the other hand, there are some performance issues to be aware of. Try to avoid false sharing, and have each thread (or preferably all threads) work on a block of data that is consecutive in memory. That way, every cache line the CPU loads serves several accesses instead of forcing repeated trips to main memory. Accessing main memory is very expensive compared to accessing the CPU cache (often on the order of a hundred times slower than an L1 cache hit).
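A small sketch of the consecutive-access idea, using a made-up function name `sum_contiguous`: with `schedule(static)` and no chunk size, OpenMP splits the iteration space into roughly equal contiguous blocks, at most one per thread, so each thread walks through its own stretch of the array in order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sums a read-only array.
double sum_contiguous(const std::vector<double>& values) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(values.size());
    double sum = 0.0;

    // schedule(static) without a chunk size gives each thread one
    // contiguous block of iterations; reduction(+:sum) gives each thread
    // a private partial sum that is combined at the end of the loop.
    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        sum += values[i];
    }
    return sum;
}
```

The reduction also means no thread writes to a shared accumulator inside the loop.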

Good luck!


 