
Multiple pointers on device memory for a single allocated array in CUDA

I am wondering whether it is possible to set up multiple pointers into a single block of data that is already allocated in memory. I ask because I was implementing lexicographical sorting on the GPU with the help of Thrust vectors (and failed miserably in terms of running time).

For example, I am trying to achieve the equivalent of these C++ statements:

unsigned int * pword;      //setting up the array of memory for permutations of word
pword = new unsigned int [N*N];

unsigned int* * p_pword;    //pointers to permutation words
p_pword = new unsigned int* [N];

//setting up the pointers on the locations such that if N=4 then 0,4,8,12,...
int count;
for(count=0;count<N;count++)
        p_pword[count]=&pword[count*N];

I am not asking anyone to provide me with code; I just want to know whether there is a way to set up pointers into a single array of data. PS: I have tried the following approach, but it gave no speedup at all:

int * raw_ptr = thrust::raw_pointer_cast(&d_Data[0]); //doing same with multiple pointers

But I guess that, because I am pointing into a device_vector, slow access might be the problem.

Any help in this regard is highly appreciated.

Well, this doesn't make any sense:

int * raw_ptr = thrust::raw_pointer_cast([0]);
                                          ^ what is this??

I don't think that line would compile correctly.

But in Thrust you can certainly do something like this:

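#include <cstdio>
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/device_ptr.h>
#include <thrust/sequence.h>
#include <thrust/copy.h>

int main(){

  const int N=16;  // compile-time constant, so p_A below is not a variable-length array
  thrust::device_vector<int> d_A(4*N);
  thrust::sequence(d_A.begin(), d_A.end());  // d_A = 0, 1, 2, ..., 4*N-1
  // N device pointers into the same allocation, one every 4 elements
  thrust::device_ptr<int> p_A[N];
  for (int i=0; i<N; i++)
    p_A[i] = &(d_A[4*i]);
  thrust::host_vector<int> h_A(N);
  // copy d_A[16] .. d_A[31] (from p_A[4] up to, but not including, p_A[8]) to the host
  thrust::copy(p_A[4], p_A[8], h_A.begin());
  for (int i=0; i<N; i++)
    printf("h_A[%d] = %d\n", i, h_A[i]);
  return 0;
}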

Not sure what to say about speedup. Speedup in the context of the tiny snippet of code you've posted doesn't make much sense to me.
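As a side note on the lexicographical sorting mentioned in the question: the row pointers do not have to be stored at all, because row r of an N-by-N row-major array always starts at base + r*N. The following is a minimal host-side C++ sketch of that idea, not Thrust code and not benchmarked. The function name sorted_row_order is hypothetical, and it assumes the permutations are stored as N rows of N unsigned ints, as in the question's pword array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not from the original post): returns the row indices
// of an n-by-n row-major array, ordered so that the rows they refer to are in
// lexicographic order. No array of pointers is built; each row's start
// address is computed on the fly as base + r * n.
std::vector<std::size_t> sorted_row_order(const std::vector<unsigned int>& flat,
                                          std::size_t n) {
    assert(flat.size() == n * n);  // assumption: exactly n rows of n elements

    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});  // 0, 1, ..., n-1

    const unsigned int* base = flat.data();
    std::sort(order.begin(), order.end(),
              [base, n](std::size_t a, std::size_t b) {
                  const unsigned int* ra = base + a * n;  // start of row a
                  const unsigned int* rb = base + b * n;  // start of row b
                  return std::lexicographical_compare(ra, ra + n, rb, rb + n);
              });
    return order;
}
```

For a 3-by-3 input with rows {2,1,0}, {0,1,2}, {1,0,2}, the returned order is 1, 2, 0. On the GPU, a comparison of the same shape could presumably be written as a functor holding a raw device pointer and passed to thrust::sort over a device_vector of row indices. Whether that is faster than the original attempt would have to be measured.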
