
Passing char array to CUDA Kernel

I am trying to pass a char array containing 10000 words, read from a txt file in the main function, to a CUDA kernel function.

The words are transferred from the host to the device like this:

(main function code:)

//.....
    const int text_length = 20;

    char (*wordList)[text_length] = new char[10000][text_length];
    char *dev_wordList;

    for(int i=0; i<number_of_words; i++)
    {
        file>>wordList[i];
        cout<<wordList[i]<<endl;
    }

    cudaMalloc((void**)&dev_wordList, 20*number_of_words*sizeof(char));
    cudaMemcpy(dev_wordList, &(wordList[0][0]), 20 * number_of_words * sizeof(char), cudaMemcpyHostToDevice);

    //Setup execution parameters
    int n_blocks = (number_of_words + 255)/256;
    int threads_per_block = 256;

    dim3 grid(n_blocks, 1, 1);
    dim3 threads(threads_per_block, 1, 1);

    cudaPrintfInit();
    testKernel<<<grid, threads>>>(dev_wordList);
    cudaDeviceSynchronize();
    cudaPrintfDisplay(stdout,true);
    cudaPrintfEnd();

(kernel function code:)

__global__ void testKernel(char* d_wordList)
{
    //access thread id
    const unsigned int bid = blockIdx.x;
    const unsigned int tid = threadIdx.x;
    const unsigned int index = bid * blockDim.x + tid;

    cuPrintf("!! %c%c%c%c%c%c%c%c%c%c \n" , d_wordList[index * 20 + 0],
                                            d_wordList[index * 20 + 1],
                                            d_wordList[index * 20 + 2],
                                            d_wordList[index * 20 + 3],
                                            d_wordList[index * 20 + 4],
                                            d_wordList[index * 20 + 5],
                                            d_wordList[index * 20 + 6],
                                            d_wordList[index * 20 + 7],
                                            d_wordList[index * 20 + 8],
                                            d_wordList[index * 20 + 9]);
}

Is there an easier way to manipulate them? (I would like to have one word per element/position.) I tried with <string>, but I can't use it in CUDA device code.

cuPrintf("%s\n", d_wordList+(index*20));

should work, provided your strings are zero-terminated.
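To spell out why that pointer expression works: adding index*20 to the base pointer lands on the first character of word number index, and everything up to the next zero byte is that word. Here is a minimal host-side C++ sketch of the same addressing. The names word_at and bounded_length are made up for illustration and are not part of CUDA; since they use no library calls, I would expect them to also compile as __host__ __device__ functions, but I have not tried that here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: pointer to the start of word `index` in a flat buffer
// in which every word occupies exactly `stride` bytes.
inline const char* word_at(const char* buffer, std::size_t index, std::size_t stride)
{
    assert(stride > 0);
    return buffer + index * stride;
}

// Hypothetical helper: length of the word stored in one slot. It stops at the
// first zero byte, but never reads past the end of the slot, so a missing
// terminator cannot make it run into the next word.
inline std::size_t bounded_length(const char* word, std::size_t stride)
{
    std::size_t n = 0;
    while (n < stride && word[n] != '\0') {
        ++n;
    }
    return n;
}
```

With a stride of 20, word_at(d_wordList, index, 20) is the same address as the one passed to cuPrintf above.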

Update:

This line:

char (*wordList)[text_length] = new char[10000][text_length];

looks strange to me. In general, an array of pointers to char would be allocated like this:

char** wordList = new char*[10000];
for (int i=0;i<10000;i++) wordList[i] = new char[20];

In this case, wordList[i] would be a pointer to string number i.
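One thing to keep in mind with the array-of-pointers layout: every wordList[i] is a separate allocation, so the strings are not guaranteed to sit next to each other in memory, and a single cudaMemcpy of 20 * number_of_words bytes starting at wordList[0] would not copy all of them. A possible workaround is to gather the strings into one contiguous staging buffer first. The sketch below is only an illustration under that assumption; flatten_words is a hypothetical helper name, and it assumes every string is zero-terminated and shorter than the slot size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: gather separately allocated, zero-terminated strings
// into one contiguous buffer with a fixed slot size of `stride` bytes.
std::vector<char> flatten_words(const char* const* words, std::size_t count, std::size_t stride)
{
    // Zero-filled, so every slot is already terminated after its word.
    std::vector<char> flat(count * stride, '\0');
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t len = std::strlen(words[i]);
        assert(len < stride);  // assumption: the word leaves room for the terminator
        std::memcpy(flat.data() + i * stride, words[i], len);
    }
    return flat;
}
```

The result's data() pointer and size() would then be the source pointer and byte count for one cudaMemcpy call.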

Update #2:

If you need to store your strings as a consecutive block, and you are sure that none of your strings is longer than text_length-1 characters (leaving room for the terminating zero), then you can do it like this:

char *wordList = new char[10000*text_length];

for(int i=0; i<number_of_words; i++)
{
    file>>wordList+(i*text_length);
    cout<<wordList+(i*text_length)<<endl;
}

In that case, wordList + (i*text_length) will point to the beginning of string number i. It will be zero-terminated because that is how it is read from the file, and you will be able to print it out in the way specified in this answer. If any of your strings is longer than text_length-1 characters, however, you will still get issues, because the read will overrun that string's slot.
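If you cannot guarantee the maximum word length, one way around the overrun is to read each word into a std::string first and copy at most text_length-1 characters into its slot. This is just a sketch of that idea, not the only way to do it: read_words is a hypothetical helper name, and silently truncating long words is my assumption about what you want (you might prefer to skip or reject them instead).

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Hypothetical helper: read up to `max_words` whitespace-separated words from
// `in` into a flat buffer of `stride`-byte slots. Returns the number stored.
std::size_t read_words(std::istream& in, std::size_t max_words,
                       std::size_t stride, std::vector<char>& flat)
{
    assert(stride > 0);
    flat.assign(max_words * stride, '\0');  // zero-filled slots
    std::string token;
    std::size_t count = 0;
    while (count < max_words && in >> token) {
        // Keep at most stride-1 characters so the slot always ends in '\0'.
        const std::size_t len = token.size() < stride - 1 ? token.size() : stride - 1;
        token.copy(flat.data() + count * stride, len);
        ++count;
    }
    return count;
}
```

Called as read_words(file, 10000, text_length, buffer), this leaves word number i at buffer.data() + i*text_length, which is the same layout as above.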
