Data race while using a mutex

Question

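I'm working on a small project in C that involves threads and mutexes. The program I'm writing applies filters to bmp images. The goal of the project is to implement a program that can handle this command line: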

$ ./filter -f filter1[,filter2[,...]] -t numThreads1[,numThreads2[,...]] input-folder output-folder

where -f lists the filters I want to apply ("red", "blue", "green", "grayscale" and "blur") and -t gives the number of threads allocated to each filter.

So far everything works fine except for blur, because I'm stuck on a data race (or at least I think so). The blur filter works like this:

/* Add a Gaussian blur to an image using
* this 3X3 matrix as weights matrix:
*   0.0  0.2  0.0
*   0.2  0.2  0.2
*   0.0  0.2  0.0
*
* If we consider the red component in this image
* (every element has a value between 0 and 255)
*
*   1  2  5  2  0  3
*      -------
*   3 |2  5  1| 6  0       0.0*2 + 0.2*5 + 0.0*1 +
*     |       |
*   4 |3  6  2| 1  4   ->  0.2*3 + 0.2*6 + 0.2*2 +   ->  3.2
*     |       |
*   0 |4  0  3| 4  2       0.0*4 + 0.2*0 + 0.0*3
*      -------
*   9  6  5  0  3  9
* 
* The new value of the pixel (3, 4) is round(3.2) = 3.
*
* If a pixel is outside the image, we increment the central pixel weight by 0.2
* So the new value of pixel (0, 0) is:
*   0.2 * 0 + 0.2 * 9 + 0.2 * 6 + 0.2 * 9 + 0.2 * 9 = 6.6 -> 7
*/
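As a worked check of the arithmetic in the comment above, here is a minimal sketch for a single color channel. The helper name blur_channel and the plain int grid are assumptions made for illustration; they are not part of the project's code. A neighbour that falls outside the grid is replaced by the centre value, which has the same effect as adding 0.2 to the centre weight.

```c
#include <assert.h>

/* Hypothetical helper (not from the project): blur one channel of a small
 * row-major grid. A neighbour outside the grid is replaced by the centre
 * value, i.e. the centre weight grows by 0.2 per missing neighbour. */
static int blur_channel(const int *grid, int width, int height, int x, int y)
{
    assert(x >= 0 && x < width && y >= 0 && y < height);

    int c     = grid[y * width + x];
    int left  = (x > 0)          ? grid[y * width + x - 1]   : c;
    int right = (x < width - 1)  ? grid[y * width + x + 1]   : c;
    int up    = (y > 0)          ? grid[(y - 1) * width + x] : c;
    int down  = (y < height - 1) ? grid[(y + 1) * width + x] : c;

    /* All values are non-negative, so adding 0.5 and truncating
     * rounds to the nearest integer. */
    return (int)((c + left + right + up + down) / 5.0 + 0.5);
}
```

With the 6x5 grid from the comment stored top row first, blur_channel(grid, 6, 5, 2, 2) covers the boxed window and blur_channel(grid, 6, 5, 0, 4) covers the bottom-left corner. That corner matches the comment's pixel (0, 0) only under the assumption that the comment counts rows from the bottom.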

The problem is that when I run my program on a "chessboard" image with the blur filter:

$ ./filter -f blur -t 8 chess.bmp chessBlur.bmp

I expect to get this image, but I get this one instead (the "broken" lines change randomly from run to run).

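I'm using a mutex to lock and unlock the critical section, but as you can see, the data race still occurs. Just a few words on the filter: I give each thread one line at a time, going from bottom to top. My code for filter_blur is: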

int filter_blur(struct image *img, int nThread)
{
    int error = 0;
    int mod = img->height%nThread;
    if (mod > 0)
        mod = 1;

    pthread_t threads[nThread];
    pthread_mutex_t mutex;
    args arguments[nThread];

    struct image* img2 = (struct image*)malloc(sizeof(struct image));
    memcpy(img2,img,sizeof(struct image));

    error=pthread_mutex_init( &mutex, NULL);
    if(error!=0)
        err(error,"pthread_mutex_init");

    int i = 0;
    for (i=0; i<nThread; i++) {
        arguments[i].img2 = img2;
        arguments[i].mutex = &mutex;
    }

    int j = 0;
    for (i=0; i<(img->height)/nThread + mod; i++) {
        for (j=0; j<nThread; j++) {

            arguments[j].img = img; arguments[j].line = i*nThread + j;

            error=pthread_create(&threads[j],NULL,threadBlur,(void*)&arguments[j]);
            if(error!=0)
                err(error,"pthread_create");
        }
        for (j=0; j<nThread; j++) {
            error=pthread_join(threads[j],NULL);
            if(error!=0)
                err(error,"pthread_join");
        }
    }
    free(img2);
    return 0;
}

void* threadBlur(void* argument) {

    // unpacking arguments
    args* image = (args*)argument;
    struct image* img = image->img;
    struct image* img2 = image->img2;
    pthread_mutex_t* mutex = image->mutex;

    int error;
    int line = image->line;
    if (line < img->height) {
        int i;

        error=pthread_mutex_lock(mutex);
        if(error!=0)
            fprintf(stderr,"pthread_mutex_lock");

        for (i=0; i<img->width; i++) {
            img->pixels[line * img->width +i] = blur(img2,i,line);
        }

        error=pthread_mutex_unlock(mutex);
        if(error!=0)
            fprintf(stderr,"pthread_mutex_unlock");
    }
    pthread_exit(NULL);
}

struct pixel blur(struct image* img2, int x, int y) {
    double red = 0;
    double green = 0;
    double blue = 0;

    red=(double)img2->pixels[y * img2->width + x].r/5.0;
    green=(double)img2->pixels[y * img2->width + x].g/5.0;
    blue=(double)img2->pixels[y * img2->width + x].b/5.0;

    if (x != 0) {
        red+=(double)img2->pixels[y * img2->width + x - 1].r/5.0;
        green+=(double)img2->pixels[y * img2->width + x - 1].g/5.0;
        blue+=(double)img2->pixels[y * img2->width + x - 1].b/5.0;
    } else {
        red+=(double)img2->pixels[y * img2->width + x].r/5.0;
        green+=(double)img2->pixels[y * img2->width + x].g/5.0;
        blue+=(double)img2->pixels[y * img2->width + x].b/5.0;
    }

    if (x != img2->width - 1) {
        red+=(double)img2->pixels[y * img2->width + x + 1].r/5.0;
        green+=(double)img2->pixels[y * img2->width + x + 1].g/5.0;
        blue+=(double)img2->pixels[y * img2->width + x + 1].b/5.0;
    } else {
        red+=(double)img2->pixels[y * img2->width + x].r/5.0;
        green+=(double)img2->pixels[y * img2->width + x].g/5.0;
        blue+=(double)img2->pixels[y * img2->width + x].b/5.0;
    }

    if (y != 0) {
        red+=(double)img2->pixels[(y - 1) * img2->width + x].r/5.0;
        green+=(double)img2->pixels[(y - 1) * img2->width + x].g/5.0;
        blue+=(double)img2->pixels[(y - 1) * img2->width + x].b/5.0;
    } else {
        red+=(double)img2->pixels[y * img2->width + x].r/5.0;
        green+=(double)img2->pixels[y * img2->width + x].g/5.0;
        blue+=(double)img2->pixels[y * img2->width + x].b/5.0;
    }

    if (y != img2->height - 1) {
        red+=(double)img2->pixels[(y + 1) * img2->width + x].r/5.0;
        green+=(double)img2->pixels[(y + 1) * img2->width + x].g/5.0;
        blue+=(double)img2->pixels[(y + 1) * img2->width + x].b/5.0;
    } else {
        red+=(double)img2->pixels[y * img2->width + x].r/5.0;
        green+=(double)img2->pixels[y * img2->width + x].g/5.0;
        blue+=(double)img2->pixels[y * img2->width + x].b/5.0;
    }

    struct pixel pix = {(unsigned char)round(blue),(unsigned char)round(green),(unsigned char)round(red)};
    return pix;
}

Edit 1:

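As @job correctly guessed, the problem was caused by the memcpy of my struct (the struct was copied, but the pointer inside the copy still pointed to the original struct's pixel data). I also removed the mutexes (they were only there because I thought they might solve my problem; sorry, my bad). Now my project works like a charm (even if we can still discuss the processing speed and whether threads are needed). As I said, this is a university project for my C class. The goal is to parallelize our filters, so threads are required.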

Thanks!

Answer 1

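Well, this is not so much an answer as a number of observations about your code:

You don't actually appear to access any particular memory cell from more than one thread anywhere in the program, so there seems to be no need for mutexes.
Alternatively, the threads may be accessing the same memory segment. In that case, your program will most likely be more efficient if a single thread does all the calculations. You should benchmark that case and compare it with the threaded version.
To me at least, there is no obvious reason why multithreading is needed here. If you did these floating-point calculations in a single thread, they would most likely finish before the OS could even spawn a second thread. The workload is insignificant compared with the overhead of thread creation.
Your current multithreaded design is flawed: all the work is done inside the mutex-protected code. No real work can be done outside the mutex, so even if you create 1000 threads, only one can execute at a time while the others sleep, waiting for their turn.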
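One way to act on these observations is sketched below, under stated assumptions. The struct pixel and struct image layouts are guessed from how the question's code uses them, and struct job, blur_at, blur_rows and blur_parallel are hypothetical names. Each thread is created once and handles rows first, first + stride, and so on. All reads go to a private snapshot of the pixels and each output row is written by exactly one thread, so no mutex is involved.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layouts: the question does not show the real definitions. */
struct pixel { unsigned char b, g, r; };
struct image { int width, height; struct pixel *pixels; };

/* Hypothetical per-thread job: rows first, first + stride, ... */
struct job {
    const struct pixel *src; /* private snapshot, only ever read */
    struct image *dst;       /* each row is written by exactly one thread */
    int first, stride;
};

/* Average of a pixel and its four neighbours; a missing neighbour is
 * replaced by the centre pixel, as in the question's blur(). */
static struct pixel blur_at(const struct pixel *src, int w, int h, int x, int y)
{
    int xs[5] = { x, x > 0 ? x - 1 : x, x < w - 1 ? x + 1 : x, x, x };
    int ys[5] = { y, y, y, y > 0 ? y - 1 : y, y < h - 1 ? y + 1 : y };
    int r = 0, g = 0, b = 0;
    for (int k = 0; k < 5; k++) {
        const struct pixel *p = &src[ys[k] * w + xs[k]];
        r += p->r; g += p->g; b += p->b;
    }
    /* Sums are non-negative, so +0.5 then truncation rounds to nearest. */
    struct pixel out = { (unsigned char)(b / 5.0 + 0.5),
                         (unsigned char)(g / 5.0 + 0.5),
                         (unsigned char)(r / 5.0 + 0.5) };
    return out;
}

static void *blur_rows(void *arg)
{
    struct job *j = arg;
    for (int y = j->first; y < j->dst->height; y += j->stride)
        for (int x = 0; x < j->dst->width; x++)
            j->dst->pixels[y * j->dst->width + x] =
                blur_at(j->src, j->dst->width, j->dst->height, x, y);
    return NULL;
}

/* Returns 0 on success, -1 on allocation or thread-creation failure
 * (in the latter case the image may be only partially blurred). */
static int blur_parallel(struct image *img, int nthreads)
{
    assert(img != NULL && nthreads > 0);
    size_t bytes = sizeof(struct pixel) * (size_t)img->width * (size_t)img->height;
    struct pixel *snapshot = malloc(bytes);
    pthread_t *tids = malloc(sizeof *tids * (size_t)nthreads);
    struct job *jobs = malloc(sizeof *jobs * (size_t)nthreads);
    if (!snapshot || !tids || !jobs) {
        free(snapshot); free(tids); free(jobs);
        return -1;
    }
    memcpy(snapshot, img->pixels, bytes);

    int started = 0, rc = 0;
    for (; started < nthreads; started++) {
        jobs[started] = (struct job){ snapshot, img, started, nthreads };
        if (pthread_create(&tids[started], NULL, blur_rows, &jobs[started]) != 0) {
            rc = -1;
            break;
        }
    }
    for (int t = 0; t < started; t++)
        pthread_join(tids[t], NULL);

    free(snapshot); free(tids); free(jobs);
    return rc;
}
```

A call such as blur_parallel(img, 8) creates eight threads in total, rather than one thread per image row as in the question's loop. Whether this actually beats a single thread on a given image is something to benchmark rather than assume.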

Answer 2

First of all, many thanks for your help! Thanks to your answers, I managed to fix my code :-)

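A lot of comments pointed out the uselessness of the mutexes, and I agree they were more of a bottleneck for the program's performance than a solution to my problem. I added them because I hoped they would magically solve my problem (sometimes miracles do happen in programming). They are now gone (they should never have been there in the first place), and the code is faster!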

Back to the original problem! To apply the blur filter, I need a read-only copy of the image. To get this copy, I used memcpy as follows:

struct image* img2 = (struct image*)malloc(sizeof(struct image));
memcpy(img2,img,sizeof(struct image));

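But as @jop pointed out, even though I was copying img, the pixels pointer inside the copied img2 still pointed to the original array. So instead of copying img, I needed to copy img->pixels. I modified my code as follows: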

struct pixel* pixels = (struct pixel*)malloc(sizeof(struct pixel)*img->width*img->height);
memcpy(pixels,img->pixels,sizeof(struct pixel)*img->width*img->height);
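To make the difference between the two copies concrete, here is a small sketch. The struct layouts are assumptions inferred from how the code above uses them, and image_deep_copy and image_free are hypothetical helpers, not functions from the project. Copying the struct alone copies only the value of the pixels pointer, so both structs share one array; a deep copy allocates a second array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layouts: the real definitions are not shown in the question. */
struct pixel { unsigned char b, g, r; };
struct image { int width, height; struct pixel *pixels; };

/* Hypothetical helper: returns a heap copy of img whose pixels array is
 * freshly allocated too, or NULL on allocation failure. */
static struct image *image_deep_copy(const struct image *img)
{
    assert(img != NULL);
    size_t bytes = sizeof(struct pixel) * (size_t)img->width * (size_t)img->height;

    struct image *copy = malloc(sizeof *copy);
    if (copy == NULL)
        return NULL;

    *copy = *img;                 /* copies width, height and the pointer VALUE */
    copy->pixels = malloc(bytes); /* replace the aliased pointer */
    if (copy->pixels == NULL) {
        free(copy);
        return NULL;
    }
    memcpy(copy->pixels, img->pixels, bytes);
    return copy;
}

/* Releases a copy made by image_deep_copy(), including its pixel array. */
static void image_free(struct image *img)
{
    if (img != NULL) {
        free(img->pixels);
        free(img);
    }
}
```

In the blur filter, the threads would then read from the deep copy while writing into the original image, and the copy would be released with image_free once all threads have been joined.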

And the problem is solved! So thank you all!

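Some comments also discussed whether threads are needed at all. Well, in this case I have no choice, because the goal of the project is to write parallelized image filters. So yes, threads are required!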
