Pthreads - turning sequential programs into parallel

Question

I'm simulating Conway's Game of Life in C++, where a 2D matrix represents the board: a 0 is an empty cell and a 1 is a living cell. I originally wrote this sequentially and then tried to make it parallel with pthreads. Since then the program no longer behaves as expected. It goes through both loops and seems to pick up some of the count++ increments, but not all of them, so each round every cell is evaluated as having only zero or one neighbors (even when that is not the case). As a result, the final "result" after the set number of steps is all zeroes, because every cell dies without being able to reproduce. I've been working on this for a couple of days and changing different things, but I still can't figure it out. Here's my code:

#include <iostream>
#include <vector>
#include <pthread.h>
#include <cstdlib>
#include <functional>
using namespace std;
pthread_mutex_t mymutex;
int lifetime, numthreads = 5;
vector<vector<int> > board,result,pending;

void *loader(void *tid){
    long thid = long(tid);
    int n = board.size();
    result = board;
    int count = 0;
        for(long i = 0; i < n; i ++){
            if(i % numthreads != thid)
                continue;
            for(long j = 0; j < n ; j++){
                if(i % numthreads != thid)
                    continue;
                if(i+1 < n){
                    if(result[i+1][j] == 1) //checking each of the neighbor
                        count++;
                    if(j+1 < n){
                        if(result[i+1][j+1] == 1)
                            count++;
                    }
                    if(j-1 >= 0){
                        if(result[i+1][j-1] == 1)
                            count++;
                    }
                }
                if(j-1 >= 0){
                    if(result[i][j-1] == 1)
                        count++;
                }
                if(j+1 < n){
                    if(result[i][j+1] == 1)
                        count++;
                }
                if(i-1 >= 0){
                    if(result[i-1][j] == 1)
                        count++;
                    if(j+1 < n){
                        if(result[i-1][j+1] == 1)
                            count++;
                    }
                    if(j-1 >= 0){
                        if(result[i-1][j-1] == 1)
                            count++;
                    }
                }
                //determining next state
                if(count <= 1 || count >= 4){ //this utilizes the three main rules of game
                    pthread_mutex_lock(&mymutex);
                    pending[i][j] = 0;
                    pthread_mutex_unlock(&mymutex);
                }else if(count == 3){
                    pthread_mutex_lock(&mymutex);
                    pending[i][j] = 1;
                    pthread_mutex_unlock(&mymutex);
                }else{
                    pthread_mutex_lock(&mymutex);
                    pending[i][j] = result[i][j];
                    pthread_mutex_unlock(&mymutex);
                }
                count = 0;
                pthread_mutex_lock(&mymutex);
                result = pending;
                pthread_mutex_unlock(&mymutex);
            }
        }
        pthread_exit(NULL);
        return NULL;
}

int main(){
    //setting up input
    int n;
    cin >> n;
    board.resize(n);
    result.resize(n);
    pending.resize(n);
    for(int i = 0; i < board.size(); i++){
        board[i].resize(n);
        result[i].resize(n);
        pending[i].resize(n);
    }
    for(int i = 0; i < n; i++){
        for(int j = 0; j < n; j++){
            cin >> board[i][j];
        }
    }

    cin >> lifetime;

    //making threads, enacting fn
    pthread_t threads[numthreads];
    void *status[numthreads];
    pthread_mutex_init(&mymutex,NULL);
    int rc;
    for(int i = 0; i < lifetime; i++){
        for(int t = 0; t < numthreads; t++){
            rc = pthread_create(&threads[t],NULL,loader,(void *)t);
            if(rc)
                exit(-1);
        }
        for(int t = 0; t < numthreads; t++){
            rc = pthread_join(threads[t],&status[t]);
            if(rc)
                exit(-1);
        }
    }

    for(int i = 0; i < n; i++){
        for(int j = 0; j < n; j++){
            cout << result[i][j] << " ";
        }
        cout << endl;
    }
}

count is private to each thread here, right, because it is created after the threads are started? That was the only thing I could think of. Maybe my loops are written incorrectly, but this is the first pthreads program I've written, so I'm not yet sure of the best way to split up a nested for loop.

Answer 1

There are three correctness issues I can see immediately.

First, every thread sets result = board with no locking, and you don't even want to do that every loop anyway. Just have the main thread do that once - subsequent iterations use result as their input.

Second, these nested loops:

for(long i = 0; i < n; i ++){
    if(i % numthreads != thid)
        continue;
    for(long j = 0; j < n ; j++){
        if(i % numthreads != thid)
            continue;
        /* ... */

look as if they require both the row and the column to match the thread ID. As posted, though, the inner test checks i again rather than j, so it is redundant: it can never skip a cell that the outer test let through. If it had tested j % numthreads instead, most of your cells would be skipped. For example, with numthreads equal to 3, thread 0 would visit [0][0], [0][3], ... and thread 1 would visit [1][1], [1][4], ... but no thread would visit [0][1] (because the row matches thread 0 and the column matches thread 1).

Either way, the clean approach is to divide up the rows between threads and let one thread process each entire row:

for(long i = 0; i < n; i ++){
    if(i % numthreads != thid)
        continue;
    for(long j = 0; j < n ; j++){
        /* ... */
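
To check that a row-wise split leaves no gaps, here is a minimal sketch. The helper names rows_for_thread and row_visit_counts are made up for illustration and are not part of the original program. The loop steps i by numthreads, which visits exactly the rows that pass the i % numthreads == thid test.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: the rows owned by thread `thid` under a cyclic split.
// These are exactly the rows for which i % numthreads == thid.
std::vector<long> rows_for_thread(long n, long numthreads, long thid) {
    assert(numthreads > 0 && thid >= 0 && thid < numthreads);
    std::vector<long> rows;
    for (long i = thid; i < n; i += numthreads)
        rows.push_back(i);
    return rows;
}

// Hypothetical check: how many threads claim each row.
// With a correct split every entry is 1 (no row skipped, none done twice).
std::vector<int> row_visit_counts(long n, long numthreads) {
    std::vector<int> visits(n, 0);
    for (long t = 0; t < numthreads; ++t)
        for (long i : rows_for_thread(n, numthreads, t))
            ++visits[i];
    return visits;
}
```

Because each row belongs to exactly one thread, no two threads ever write the same cell of pending, so the writes to pending need no mutex as long as nobody reads pending during the step.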

Third, every thread is updating result after every cell is processed - this means that some cells are calculating their result based on partial results from other cells, and this doesn't even happen in a deterministic order so the result won't be stable.

You can fix this by removing the code that updates result in the loader() function and putting that inside the lifetime loop in main() , so it just happens once for every step of the game.
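
The data flow this describes could look roughly like the sketch below. It is an illustration under assumptions, not the asker's final program: compute_rows and run_steps are hypothetical names, and the per-thread calls run one after another here as a stand-in for the pthread_create / pthread_join pair in main(). The point is only that every cell reads from result, writes to pending, and result changes once per step.

```cpp
#include <cassert>
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Count live neighbours of (i, j), treating cells off the board as dead.
int count_neighbors(const Grid& g, long i, long j) {
    const long n = static_cast<long>(g.size());
    int count = 0;
    for (long r = i - 1; r <= i + 1; ++r)
        for (long c = j - 1; c <= j + 1; ++c)
            if ((r != i || c != j) && r >= 0 && r < n && c >= 0 && c < n
                && g[r][c] == 1)
                ++count;
    return count;
}

// Hypothetical per-thread work: reads only `result`, writes only `pending`.
void compute_rows(const Grid& result, Grid& pending, long thid, long numthreads) {
    const long n = static_cast<long>(result.size());
    for (long i = thid; i < n; i += numthreads)
        for (long j = 0; j < n; ++j) {
            int count = count_neighbors(result, i, j);
            if (count <= 1 || count >= 4) pending[i][j] = 0;
            else if (count == 3)          pending[i][j] = 1;
            else                          pending[i][j] = result[i][j];
        }
}

// Hypothetical driver mirroring the lifetime loop in main().
Grid run_steps(const Grid& board, int lifetime, long numthreads) {
    assert(numthreads > 0);
    Grid result = board;   // done once, before the first step
    Grid pending = board;  // same shape; fully overwritten every step
    for (int step = 0; step < lifetime; ++step) {
        // Stand-in for: create numthreads threads, then join them all.
        for (long t = 0; t < numthreads; ++t)
            compute_rows(result, pending, t, numthreads);
        result.swap(pending);  // publish the new board once per step
    }
    return result;
}
```

Swapping instead of assigning result = pending avoids copying the whole board each step. It is safe here because every cell of pending is rewritten in the next step.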

There's also a performance issue - you are starting up and stopping a bunch of threads every step of the game. That won't perform very well at all - starting and stopping threads is a heavyweight operation. Once you have it working, you can fix this by having each thread do the lifetime loop and stay running the whole time. You synchronise at each step using pthread_barrier_wait() .
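
A minimal sketch of that long-running-thread design follows, assuming a platform that provides POSIX barriers. The names Shared, WorkerArg, worker and simulate are invented for this example. Two barrier waits are used per step: the first ensures every thread has finished writing the next board, and the second ensures nobody starts reading until the one thread that receives PTHREAD_BARRIER_SERIAL_THREAD has swapped the boards.

```cpp
#include <cassert>
#include <cstdlib>
#include <pthread.h>
#include <utility>
#include <vector>

using Grid = std::vector<std::vector<int>>;

// Hypothetical state shared by all workers.
struct Shared {
    Grid current, next;
    int lifetime;
    long numthreads;
    pthread_barrier_t barrier;
};
struct WorkerArg { Shared* shared; long thid; };

static int live_neighbors(const Grid& g, long i, long j) {
    const long n = static_cast<long>(g.size());
    int count = 0;
    for (long r = i - 1; r <= i + 1; ++r)
        for (long c = j - 1; c <= j + 1; ++c)
            if ((r != i || c != j) && r >= 0 && r < n && c >= 0 && c < n
                && g[r][c] == 1)
                ++count;
    return count;
}

static void* worker(void* p) {
    WorkerArg* arg = static_cast<WorkerArg*>(p);
    Shared& s = *arg->shared;
    const long n = static_cast<long>(s.current.size());
    for (int step = 0; step < s.lifetime; ++step) {
        // Each thread owns rows thid, thid + numthreads, ...
        for (long i = arg->thid; i < n; i += s.numthreads)
            for (long j = 0; j < n; ++j) {
                int count = live_neighbors(s.current, i, j);
                if (count <= 1 || count >= 4) s.next[i][j] = 0;
                else if (count == 3)          s.next[i][j] = 1;
                else                          s.next[i][j] = s.current[i][j];
            }
        // Barrier 1: all writes to `next` for this step are finished.
        int rc = pthread_barrier_wait(&s.barrier);
        if (rc == PTHREAD_BARRIER_SERIAL_THREAD)
            std::swap(s.current, s.next);  // exactly one thread publishes
        // Barrier 2: nobody reads `current` until the swap is done.
        pthread_barrier_wait(&s.barrier);
    }
    return nullptr;
}

Grid simulate(const Grid& board, int lifetime, long numthreads) {
    assert(numthreads > 0);
    Shared s;
    s.current = board;
    s.next = board;
    s.lifetime = lifetime;
    s.numthreads = numthreads;
    if (pthread_barrier_init(&s.barrier, nullptr,
                             static_cast<unsigned>(numthreads)) != 0)
        std::abort();
    std::vector<pthread_t> threads(numthreads);
    std::vector<WorkerArg> args(numthreads);
    for (long t = 0; t < numthreads; ++t) {
        args[t] = WorkerArg{&s, t};
        if (pthread_create(&threads[t], nullptr, worker, &args[t]) != 0)
            std::abort();  // simplistic error handling for the sketch
    }
    for (long t = 0; t < numthreads; ++t)
        pthread_join(threads[t], nullptr);
    pthread_barrier_destroy(&s.barrier);
    return s.current;
}
```

Threads are created once and joined once, however many steps are run. Passing a pointer to a per-thread struct also avoids casting an int thread index to void *.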
