Spatial locality in CUDA loops

I was reading An Even Easier Introduction to CUDA, and I was thinking about examples like this:

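__global__
void add(int n, float *x, float *y)
{
  // Single-block version: each thread starts at its own index
  // and advances by the number of threads in the block.
  int index = threadIdx.x;
  int stride = blockDim.x;
  for (int i = index; i < n; i += stride)
    y[i] = x[i] + y[i];
}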

Here each thread strides through the array. In normal CPU computing, one would rather split the array into contiguous sub-arrays, one per thread, so that each thread can better exploit spatial locality.

Does this concept apply to CUDA's unified memory as well? I would like to understand what the most efficient approach would be in such a situation.

The reason a grid-stride loop is beneficial for memory access is that it promotes "coalesced" access to global memory. In a nutshell, coalesced access means that, on any given read or write operation considered warp-wide, adjacent threads in the warp access adjacent locations in memory.

The grid-stride loop arranges the indices across the warp to promote this pattern: on each iteration, thread k of the warp works on the element right after the one handled by thread k-1.
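To make the difference concrete, here is a minimal sketch that puts a grid-stride kernel next to a CPU-style "contiguous chunk per thread" kernel. The names add_strided, add_chunked and max_error, the launch configuration (64 blocks of 256 threads) and the array size are my own choices for illustration, not from the question or the blog post. Both kernels compute the same result; they differ only in which elements the threads of a warp touch at the same time.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Grid-stride loop: on every iteration, adjacent threads of a warp touch
// adjacent elements, so a warp-wide load covers 32 consecutive floats.
__global__ void add_strided(int n, const float *x, float *y)
{
    int index  = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;
    for (int i = index; i < n; i += stride)
        y[i] = x[i] + y[i];
}

// CPU-style split: thread t owns the contiguous range [t*chunk, (t+1)*chunk).
// On every iteration, adjacent threads touch elements `chunk` apart,
// so a warp-wide load is scattered rather than coalesced.
__global__ void add_chunked(int n, const float *x, float *y)
{
    int tid     = blockIdx.x * blockDim.x + threadIdx.x;
    int threads = gridDim.x * blockDim.x;
    int chunk   = (n + threads - 1) / threads;
    int begin   = tid * chunk;
    int end     = min(begin + chunk, n);
    for (int i = begin; i < end; ++i)
        y[i] = x[i] + y[i];
}

// Runs one of the two kernels on x = 1, y = 2 and returns the largest
// deviation of the result from 3 (hypothetical helper for this sketch).
float max_error(bool chunked, int n)
{
    float *x = nullptr, *y = nullptr;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    const int grid = 64, block = 256;  // arbitrary launch configuration
    if (chunked) add_chunked<<<grid, block>>>(n, x, y);
    else         add_strided<<<grid, block>>>(n, x, y);
    cudaDeviceSynchronize();           // wait before reading y on the host

    float err = 0.0f;
    for (int i = 0; i < n; ++i) err = fmaxf(err, fabsf(y[i] - 3.0f));
    cudaFree(x);
    cudaFree(y);
    return err;
}

int main()
{
    const int n = 1 << 20;
    printf("strided max error: %f\n", max_error(false, n));
    printf("chunked max error: %f\n", max_error(true, n));
    return 0;
}
```

Running both under a profiler such as Nsight Compute should show the difference in memory transactions per request; how much that costs in wall-clock time depends on the GPU and its caches, so I'd measure rather than assume.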

This is orthogonal to whether the memory was allocated with an "ordinary" device allocator (e.g. cudaMalloc) or a "unified" allocator (e.g. cudaMallocManaged). In either case, the best way for device code to access such an allocation is using coalesced access.
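As a rough sketch of that point, the same grid-stride kernel can be launched unchanged on memory from either allocator; only the host-side setup differs. The helper names run_managed and run_device, the launch configuration and the test values are assumptions made for this example.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// One kernel, used for both allocation styles below.
__global__ void add(int n, const float *x, float *y)
{
    int index  = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;
    for (int i = index; i < n; i += stride)
        y[i] = x[i] + y[i];
}

// Largest deviation from the expected value 3 (x = 1, y = 2).
static float max_error(const float *y, int n)
{
    float err = 0.0f;
    for (int i = 0; i < n; ++i) err = fmaxf(err, fabsf(y[i] - 3.0f));
    return err;
}

// Unified memory: the same pointers are valid on host and device.
float run_managed(int n)
{
    float *x = nullptr, *y = nullptr;
    cudaMallocManaged(&x, n * sizeof(float));
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    add<<<64, 256>>>(n, x, y);
    cudaDeviceSynchronize();           // required before the host reads y

    float err = max_error(y, n);
    cudaFree(x);
    cudaFree(y);
    return err;
}

// Ordinary device memory: explicit copies in and out.
float run_device(int n)
{
    const size_t bytes = n * sizeof(float);
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);
    float *dx = nullptr, *dy = nullptr;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy.data(), bytes, cudaMemcpyHostToDevice);

    add<<<64, 256>>>(n, dx, dy);
    // A blocking cudaMemcpy waits for the preceding kernel to finish.
    cudaMemcpy(hy.data(), dy, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dx);
    cudaFree(dy);
    return max_error(hy.data(), n);
}

int main()
{
    const int n = 1 << 20;
    printf("managed max error: %f\n", run_managed(n));
    printf("device  max error: %f\n", run_device(n));
    return 0;
}
```

The kernel body is identical in both paths, which is the sense in which the access pattern and the allocator are independent choices.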

You didn't ask about it, but CUDA shared memory also has, as one of its "optimal access patterns", adjacent threads in the warp accessing adjacent locations in (shared) memory.
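For illustration only, here is a small block-level sum in which thread t always reads and writes tile[t] (plus tile[t + s], which is likewise adjacent across neighbouring threads). The kernel name block_sum, the helper sum_ones and the block size kBlock are hypothetical names chosen for this sketch; it assumes the kernel is launched with exactly kBlock threads per block and that kBlock is a power of two.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kBlock = 256;  // assumed block size, must be a power of two

// Each block sums up to kBlock input elements into partial[blockIdx.x].
__global__ void block_sum(int n, const float *x, float *partial)
{
    __shared__ float tile[kBlock];
    int t = threadIdx.x;
    int i = blockIdx.x * blockDim.x + t;

    // Coalesced global load, and adjacent threads write adjacent shared words.
    tile[t] = (i < n) ? x[i] : 0.0f;
    __syncthreads();

    // Sequential addressing: active threads 0..s-1 access tile[t] and
    // tile[t + s], so neighbouring threads stay on neighbouring words.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (t < s) tile[t] += tile[t + s];
        __syncthreads();
    }
    if (t == 0) partial[blockIdx.x] = tile[0];
}

// Sums an array of n ones on the GPU and returns the total.
float sum_ones(int n)
{
    const int blocks = (n + kBlock - 1) / kBlock;
    std::vector<float> hx(n, 1.0f), hp(blocks, 0.0f);
    float *dx = nullptr, *dp = nullptr;
    cudaMalloc(&dx, n * sizeof(float));
    cudaMalloc(&dp, blocks * sizeof(float));
    cudaMemcpy(dx, hx.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    block_sum<<<blocks, kBlock>>>(n, dx, dp);
    cudaMemcpy(hp.data(), dp, blocks * sizeof(float), cudaMemcpyDeviceToHost);

    float total = 0.0f;
    for (float p : hp) total += p;  // finish the reduction on the host
    cudaFree(dx);
    cudaFree(dp);
    return total;
}

int main()
{
    printf("sum = %f\n", sum_ones(1 << 16));
    return 0;
}
```

Because consecutive 32-bit words of shared memory fall into consecutive banks, this layout avoids bank conflicts; it is one workable pattern, not the only one.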
