Threads not working concurrently in C pthreads

I have a quicksort implementation that I want to run in two threads. The idea is to sort two independent halves of the array at the same time, which should not be a problem given how quicksort's partitioning works.

My code is as follows:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void troca (int *v, int i, int j);
int partition (int *v, int ini, int fim);
void quicksort (int *v, int ini, int fim, int size);

typedef struct {
    int ini, mid, fim;
} thread_arg;

int size;
int *v;

void *thread_func(void* arg){
    thread_arg *a = arg;
    quicksort(v, a->ini, a->mid, a->fim);
    printf("done in\n");
    return NULL;
}

int main()
{
    // initializations
    scanf("%d", &size);
    v = malloc(size * sizeof(int));

    // read array
    for (int i = 0; i < size; ++i)
        scanf("%d", &v[i]);

    // arguments
    thread_arg argument1, argument2;
    int mid = partition(v, 0, size-1);

    argument1.ini = 0;
    argument1.mid = mid-1;
    argument1.fim = size;

    argument2.ini = mid;
    argument2.mid = size-1;
    argument2.fim = size;

    pthread_t thread1, thread2;    

    // thread and execution
    pthread_create(&thread1, NULL, thread_func, &argument1);
    printf("done out 1\n");
    pthread_create(&thread2, NULL, thread_func, &argument2);
    printf("done out 2\n");

    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    free(v);

    return 0;
}

void quicksort (int *v, int ini, int fim, int size){
    if (ini >= fim)
        return;
    int meio = partition(v, ini, fim);
    quicksort(v, ini, meio-1, size);
    quicksort(v, meio+1, fim, size);
}

int partition (int *v, int ini, int fim){
    int i = ini-1, j = ini;
    troca(v, (ini+fim)/2, fim);
    int pivo = v[fim];

    for (; j < fim; ++j)
    {
        if (v[j] <= pivo)
        {
            i++;
            troca(v, i, j);
        }
    }
    troca(v, i+1, fim);
    return i+1; // index of the pivot
}


void troca (int *v, int i, int j){
    int aux = v[i];
    v[i] = v[j];
    v[j] = aux;
    return;
}

The program runs and sorts correctly: it creates two new independent threads, each of which sorts one half of the array. The problem is that they do not seem to run at the same time. Running the program with an input of 100 million random numbers gives:

done out 1
done out 2
done in
done in

real    0m47,464s
user    0m50,686s
sys     0m0,452s

But it takes about 25 seconds for the first 3 lines to appear, and about 25 more for the last one, which suggests that the second thread is waiting for the first one to finish.

In htop, it appears that at some point both threads do run at the same time (this is backed up by the fact that this program runs a bit faster than my normal one).

Finally, I understand that it is not generally safe for threads to work simultaneously on shared data like this, but in this sorting example it should be fine.

The threads are running concurrently (well, not necessarily concurrently, but perceivably concurrently). The 25 second delay you are seeing is due to a bug in your quick sort (or perhaps the way you're sharing the list between your two threads). Essentially, thread 2 is assigned much more work than thread 1, and so it takes much longer to complete. Thread 2 is not simply executing "after" thread 1, or "waiting" for thread 1.

To prove this, I added an unsigned long* argument to quicksort and had it increment the value referenced by that pointer at each call (essentially counting the number of times each thread calls quicksort), and I ran it with 10M (not 100M) random values. Thread 1's count ended up at 3,851,991, and thread 2's count ended up at 9,693,697. Sure, there can be some small variation between the two counts due to the randomness in the generation of the list. But the difference is nearly a factor of 3, which is far more than you could expect from slight random variations.

I suggest you try a different implementation of quicksort (one which is known to work). I also suggest being more careful with data sharing (make sure the two threads never access each other's data, especially without synchronization) to get a more accurate measure of timing; the last thing you want is for one thread to be sorting or unsorting the other thread's data.
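For a more direct timing measurement than watching when lines appear on the terminal, each thread can time its own sort with a monotonic clock. The helpers below are a minimal sketch assuming a POSIX system; now_monotonic and elapsed_seconds are hypothetical names, while clock_gettime and CLOCK_MONOTONIC are the standard POSIX interfaces.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Read the monotonic clock (unaffected by wall-clock adjustments). */
static struct timespec now_monotonic(void) {
    struct timespec t;
    int rc = clock_gettime(CLOCK_MONOTONIC, &t);
    assert(rc == 0);
    (void)rc;
    return t;
}

/* Seconds elapsed between two timestamps taken with now_monotonic(). */
static double elapsed_seconds(struct timespec start, struct timespec end) {
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

In thread_func, one would call now_monotonic() just before and just after quicksort and print elapsed_seconds for that thread. If both threads report start times that are close together but very different durations, the threads overlap and the work is simply unbalanced.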

You don't divide the work fairly among the threads. To see this, modify your code as follows:

    // thread and execution
    pthread_create(&thread1, NULL, thread_func, &argument1);
    printf("done out 1 (%d)\n", argument1.mid - argument1.ini + 1);
    pthread_create(&thread2, NULL, thread_func, &argument2);
    printf("done out 2 (%d)\n", argument2.mid - argument2.ini + 1);

You will see that one thread tends to get about twice as much work as the other, sometimes more.

For example, here are a few runs using random data:

done out 1 (66474145)
done out 2 (33525855)

done out 1 (21794872)
done out 2 (78205128)

done out 1 (67867800)
done out 2 (32132200)

If you care about concurrency, you should never divide your work into a small number of very big chunks and assign each chunk to a thread. Instead, create a queue of small tasks and let threads pull tasks from the queue as they finish.
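One way to apply the queue idea to this sort is sketched below. It is an illustration under assumptions, not a tuned implementation: NTHREADS, MAX_TASKS, CUTOFF and all function names are hypothetical, small ranges are handed to the standard qsort, and the partition scheme is the one from the question. Workers pop a range, split it, push the right part back for whoever is free, and keep the left part.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NTHREADS  4      /* hypothetical worker count */
#define MAX_TASKS 1024   /* hypothetical queue capacity */
#define CUTOFF    4096   /* ranges shorter than this are sorted serially */

typedef struct { int lo, hi; } range_t;

typedef struct {
    int *v;
    range_t tasks[MAX_TASKS];  /* pending ranges (used as a stack) */
    int ntasks;                /* number of pending ranges */
    int active;                /* workers currently holding a range */
    pthread_mutex_t lock;
    pthread_cond_t cond;
} pool_t;

static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

static void swap_ints(int *v, int i, int j) { int t = v[i]; v[i] = v[j]; v[j] = t; }

/* Lomuto partition around the middle element, as in the question. */
static int partition_mid(int *v, int lo, int hi) {
    int i = lo - 1;
    swap_ints(v, (lo + hi) / 2, hi);
    for (int j = lo; j < hi; ++j)
        if (v[j] <= v[hi])
            swap_ints(v, ++i, j);
    swap_ints(v, i + 1, hi);
    return i + 1;
}

static void sort_serial(int *v, int lo, int hi) {
    if (hi > lo)
        qsort(v + lo, (size_t)(hi - lo + 1), sizeof *v, cmp_int);
}

static void *worker(void *arg) {
    pool_t *p = arg;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        /* Sleep while the queue is empty but a busy worker may still push. */
        while (p->ntasks == 0 && p->active > 0)
            pthread_cond_wait(&p->cond, &p->lock);
        if (p->ntasks == 0) {              /* empty and nobody working: done */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        range_t r = p->tasks[--p->ntasks];
        p->active++;
        pthread_mutex_unlock(&p->lock);
        /* Split large ranges: queue the right part, keep the left part. */
        while (r.hi - r.lo >= CUTOFF) {
            int m = partition_mid(p->v, r.lo, r.hi);
            pthread_mutex_lock(&p->lock);
            int queued = p->ntasks < MAX_TASKS;
            if (queued) {
                p->tasks[p->ntasks++] = (range_t){ m + 1, r.hi };
                pthread_cond_signal(&p->cond);
            }
            pthread_mutex_unlock(&p->lock);
            if (!queued)
                sort_serial(p->v, m + 1, r.hi);   /* queue full: sort it here */
            r.hi = m - 1;
        }
        sort_serial(p->v, r.lo, r.hi);
        pthread_mutex_lock(&p->lock);
        if (--p->active == 0 && p->ntasks == 0)
            pthread_cond_broadcast(&p->cond);     /* wake idle workers so they exit */
        pthread_mutex_unlock(&p->lock);
    }
}

/* Sort v[0..n-1] with NTHREADS workers pulling ranges from a shared queue. */
static void parallel_sort(int *v, int n) {
    pool_t p = { .v = v, .ntasks = 0, .active = 0 };
    pthread_t threads[NTHREADS];
    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.cond, NULL);
    p.tasks[p.ntasks++] = (range_t){ 0, n - 1 };
    for (int i = 0; i < NTHREADS; ++i) {
        int rc = pthread_create(&threads[i], NULL, worker, &p);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NTHREADS; ++i)
        pthread_join(threads[i], NULL);
    pthread_cond_destroy(&p.cond);
    pthread_mutex_destroy(&p.lock);
}
```

Because the ranges in the queue never overlap and every hand-off goes through the mutex, no two workers touch the same elements at the same time. A lopsided first partition no longer matters much: the thread that finishes early simply takes the next pending range.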

The thread creation is not fair: thread #1 gets created before thread #2. Moreover, when thread #1 is created, it may run immediately and preempt the main thread, which may then have to wait to get the CPU back before it can create and start thread #2. In general, threads running under the default SCHED_OTHER policy behave unpredictably.

To add some predictability:

  • Make the threads start at the same time once created. Use a barrier to trigger a "go" for all the threads at the same time. Cf. pthread_barrier_init().
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void troca (int *v, int i, int j);
int partition (int *v, int ini, int fim);
void quicksort (int *v, int ini, int fim, int size);



pthread_barrier_t barrier;

typedef struct {
    int id;
    int ini, mid, fim;
} thread_arg;

int size;
int *v;

void *thread_func(void* arg){
    thread_arg *a = arg;
    pthread_barrier_wait(&barrier);
    quicksort(v, a->ini, a->mid, a->fim);
    printf("done in %d\n", a->id);
    return NULL;
}

int main()
{
    // initializations
    scanf("%d", &size);
    v = malloc(size * sizeof(int));

    // read array
    for (int i = 0; i < size; ++i)
        scanf("%d", &v[i]);

    // arguments
    thread_arg argument1, argument2;
    int mid = partition(v, 0, size-1);

    argument1.id = 1;
    argument1.ini = 0;
    argument1.mid = mid-1;
    argument1.fim = size;

    argument2.id = 2;
    argument2.ini = mid;
    argument2.mid = size-1;
    argument2.fim = size;

    pthread_t thread1, thread2;    

    pthread_barrier_init(&barrier, NULL, 3);

    // thread and execution
    pthread_create(&thread1, NULL, thread_func, &argument1);
    printf("done out 1\n");
    pthread_create(&thread2, NULL, thread_func, &argument2);
    printf("done out 2\n");

    // Start the threads
    pthread_barrier_wait(&barrier);

    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    free(v);
    pthread_barrier_destroy(&barrier);

    return 0;
}

void quicksort (int *v, int ini, int fim, int size){
    if (ini >= fim)
        return;
    int meio = partition(v, ini, fim);
    quicksort(v, ini, meio-1, size);
    quicksort(v, meio+1, fim, size);
}

int partition (int *v, int ini, int fim){
    int i = ini-1, j = ini;
    troca(v, (ini+fim)/2, fim);
    int pivo = v[fim];

    for (; j < fim; ++j)
    {
        if (v[j] <= pivo)
        {
            i++;
            troca(v, i, j);
        }
    }
    troca(v, i+1, fim);
    return i+1; // index of the pivot
}


void troca (int *v, int i, int j){
    int aux = v[i];
    v[i] = v[j];
    v[j] = aux;
    return;
}
