String Matching GPU CUDA Error

This is homework, but as the homework tag is deprecated, I'm pointing that out here.

I'm working on an assignment using CUDA that does a straightforward match of a pattern in a string. The text file contains 1,000,000 chars (all the same char, except the last, which is different) and a pattern of size 100 (again all the same char, with the final one different), so the pattern should be found at position 999,000 of the text.

I am trying to get this to work with 10 threads, and so I am setting the starting points of the algorithm accordingly.

blockSize is set to 10,000, and the startPoint variable is the thread ID (0-9).

int i,j,k,lastI;

    i=startPoint*blockSize;
    j=0;
    k=startPoint*blockSize; //may be -1...

    int end;
    end = ((startPoint+1) * blockSize) - patternLength; //may be -1

    //*testchar = dev_textData[((startPoint+1) * blockSize) -1];
    *testchar = dev_pattData[patternLength-1];
    *testchar = dev_textData[textLength-1];

    //*testchar = dev_textData[i+blockSize-1];
    //*result = end;
    //return;
    while (i<=end && j<patternLength)
    {
        if (dev_textData[k] == dev_pattData[j]) //going out of bounds at the j i think...
        {
            k++;
            j++;

        }
        else
        {
            i++;
            k=i;
            j=0;

        }
    }

    if (j == patternLength)
    {
        *result = i;
        *testchar = 'f';
    }
    else
    {
        *result = -1;
    }

Firstly, the program fails with CUDA error 30, "unknown error" (I think this is perhaps a segfault?), but when I change

            if (dev_textData[k] == dev_pattData[j])

to

            if (dev_textData[k] == dev_pattData[j-1])

the error disappears. However, because I'm matching on the last char, the algorithm then does not work correctly.

I can't figure out why the j-1 makes a difference here, given the while loop boundary.

Any help / advice / pointers would be greatly appreciated.

Thanks

First, let's do the math. If you have 1,000,000 chars and the pattern length is 100, then the pattern should be found at 999,900. If you split the work between 10 threads, then each thread should be given 100,000 bytes. The reason I'm giving you a hard time is that I have to wonder whether the pattern length actually matches the pattern. In other words, does the pattern actually have 100 bytes in it, or does it only have 99 bytes?

One way to debug problems like this is to

  • take your original code
  • place it in a test environment with a tiny dataset
  • strip out all of the distracting nonsense
  • add some printf's for debugging

Here's what the code looks like after doing that:

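int i,j,k,end;
char textData[] = "aaaaaaaaab";   // 10 chars; [] leaves room for the terminating '\0'
char pattData[] = "aaaab";        // 5 chars; [] leaves room for the terminating '\0'
int blockSize = 10;
int patternLength = 5;
int startPoint = 0;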

i=startPoint*blockSize;
j=0;
k=startPoint*blockSize; 

end = ((startPoint+1) * blockSize) - patternLength; 

while (i<=end && j<patternLength)
{
    printf( "i=%d j=%d k=%d -- ", i, j, k );

    if (textData[k] == pattData[j]) 
    {
        k++;
        j++;
        printf( "match newi=%d newj=%d newk=%d\n", i, j, k );
    }
    else
    {
        i++;
        k=i;
        j=0;
        printf( "fail  newi=%d newj=%d newk=%d\n", i, j, k );
    }
}
printf( "end-of-loop i=%d j=%d k=%d\n", i, j, k );

if (j == patternLength)
{
    printf( "pattern found at %d\n", i );
}
else
{
    printf( "not found\n" );
}

And guess what ... the code works!!! So the problem has nothing to do with the core algorithm, but is somewhere else in your code.
