
Using __constant__ memory with MPI and streams

If I have a __constant__ value

__constant__ float constVal;

Which may or may not be initialized by MPI ranks on non-blocking streams:

cudaMemcpyToSymbolAsync(constVal, deviceValue, sizeof(float), 0, cudaMemcpyDeviceToDevice, stream);

Is this:

  1. Is it safe for multiple MPI ranks to access it simultaneously within kernels? I.e., do the ranks share a single instance of constVal, or do MPI semantics (each rank has a private copy) still hold?
  2. If the above is safe, is it safe for it to be initialized by multiple MPI ranks?
  1. Is it safe for multiple MPI ranks to access it simultaneously within kernels? I.e., do the ranks share a single instance of constVal, or do MPI semantics (each rank has a private copy) still hold?

Neither. CUDA contexts are not shared amongst processes. If you have multiple processes you get multiple contexts, and each context has its own copy of all the statically defined symbols and code. This behaviour is independent of MPI semantics. If you are imagining that multiple processes in an MPI communicator are sharing the same GPU context and state, they aren't.

  2. If the above is safe, is it safe for it to be initialized by multiple MPI ranks?

It isn't only safe, it is mandatory. Because each rank's context holds its own copy of the symbol, every rank must run the initialization itself; a single rank initializing it would leave the copies in all the other contexts uninitialized.
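A minimal sketch of the resulting pattern, assuming one GPU context per rank (the per-rank value and the host-to-device copy direction here are illustrative, not taken from the question):

```
#include <mpi.h>
#include <cuda_runtime.h>
#include <cstdio>

// Each process (MPI rank) owns a private CUDA context, and therefore a
// private copy of this symbol. No rank can see another rank's copy.
__constant__ float constVal;

__global__ void readConst(float *out) { *out = constVal; }

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Every rank must initialize its own copy of the symbol.
    float hostValue = 1.0f + rank;           // per-rank value (illustrative)
    cudaMemcpyToSymbolAsync(constVal, &hostValue, sizeof(float), 0,
                            cudaMemcpyHostToDevice, stream);

    // Read the symbol back from a kernel to show each rank sees only
    // the value it wrote into its own context.
    float *out;
    cudaMallocManaged(&out, sizeof(float));
    readConst<<<1, 1, 0, stream>>>(out);
    cudaStreamSynchronize(stream);

    printf("rank %d sees constVal = %f\n", rank, *out);

    cudaFree(out);
    cudaStreamDestroy(stream);
    MPI_Finalize();
    return 0;
}
```

Run under mpirun, each rank prints a different value, consistent with the private-copy semantics described above.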

