
CUDA pinned memory and coalescing

On a compute capability 2.x device, how would I make sure the GPU uses coalesced memory accesses when using mapped pinned memory, given that the same 2D data would normally require padding when allocated in global memory?

I can't seem to find information about this anywhere; perhaps I should be looking harder, or perhaps I am missing something. Any pointers in the right direction are welcome.

The coalescing approach should be applied when using zero-copy memory. Quoting the CUDA C Best Practices Guide:

Because the data is not cached on the GPU, mapped pinned memory should be read or written only once, and the global loads and stores that read and write the memory should be coalesced.
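For example, here is a minimal sketch of what coalesced zero-copy access could look like (the kernel name scaleKernel, the buffer names and sizes are illustrative assumptions, not from the question): the host buffers are allocated with cudaHostAlloc(..., cudaHostAllocMapped), mapped into the device address space with cudaHostGetDevicePointer, and consecutive threads read and write consecutive elements exactly once.

#include <cuda_runtime.h>
#include <cstdio>

// Consecutive threads access consecutive 4-byte elements, so each warp's
// load and store map to contiguous, coalesced transactions, and each
// element crosses the PCI-E bus only once.
__global__ void scaleKernel(const float *in, float *out, int n, float s)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[i] * s;
}

int main()
{
    const int n = 1 << 20;

    // Must be set before the context is created for mapped memory to work.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    float *hIn, *hOut;
    cudaHostAlloc(&hIn,  n * sizeof(float), cudaHostAllocMapped);
    cudaHostAlloc(&hOut, n * sizeof(float), cudaHostAllocMapped);
    for (int i = 0; i < n; ++i) hIn[i] = (float)i;

    // Device-side aliases of the pinned host buffers (no cudaMemcpy needed).
    float *dIn, *dOut;
    cudaHostGetDevicePointer(&dIn,  hIn,  0);
    cudaHostGetDevicePointer(&dOut, hOut, 0);

    scaleKernel<<<(n + 255) / 256, 256>>>(dIn, dOut, n, 2.0f);
    cudaDeviceSynchronize();               // kernel writes go straight to host memory

    printf("hOut[10] = %f\n", hOut[10]);   // expect 20.0
    cudaFreeHost(hIn);
    cudaFreeHost(hOut);
    return 0;
}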

Quoting the "CUDA Programming" book, by S. Cook

If you think about what happens with access to global memory, an entire cache line is brought in from memory on compute 2.x hardware. Even on compute 1.x hardware the same 128 bytes, potentially reduced to 64 or 32, is fetched from global memory. NVIDIA does not publish the size of the PCI-E transfers it uses, or details on how zero copy is actually implemented. However, the coalescing approach used for global memory could be used with PCI-E transfer. The warp memory latency hiding model can equally be applied to PCI-E transfers, providing there is enough arithmetic density to hide the latency of the PCI-E transfers.
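As for the padding part of the question: there is no cudaMallocPitch equivalent for mapped pinned memory, but the same effect can be obtained by padding each row manually. The sketch below (the dimensions, kernel name rowScale, and the 128-byte alignment target are assumptions based on the transaction size mentioned above, not documented requirements) pitches a 2D zero-copy allocation so every row starts on a 128-byte boundary, keeping per-row warp accesses coalesced just as a pitched global-memory allocation would.

#include <cuda_runtime.h>

// Each row starts on a 128-byte boundary, so a warp reading a row segment
// generates the same coalesced transactions it would for pitched global memory.
__global__ void rowScale(float *data, size_t pitchElems, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height)
        data[y * pitchElems + x] *= 2.0f;
}

int main()
{
    const int width = 1000, height = 512;   // 1000 floats = 4000 bytes per row
    const size_t align = 128;               // assumed alignment target (transaction size)
    size_t pitchBytes = ((width * sizeof(float) + align - 1) / align) * align;
    size_t pitchElems = pitchBytes / sizeof(float);

    cudaSetDeviceFlags(cudaDeviceMapHost);

    // cudaHostAlloc returns page-aligned memory, so padding each row to a
    // multiple of 128 bytes keeps every row 128-byte aligned.
    float *hData;
    cudaHostAlloc(&hData, pitchBytes * height, cudaHostAllocMapped);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            hData[y * pitchElems + x] = 1.0f;

    float *dData;
    cudaHostGetDevicePointer(&dData, hData, 0);

    dim3 block(32, 8);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    rowScale<<<grid, block>>>(dData, pitchElems, width, height);
    cudaDeviceSynchronize();

    cudaFreeHost(hData);
    return 0;
}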
