
Seg faults with pthreads_mutex

I am implementing a particle interaction simulator with pthreads, and I keep getting segmentation faults in my pthreads code. The fault occurs in the following loop, which each thread runs at the end of each timestep in my thread_routine:

    for (int i = first; i < last; i++)
    {
        get_id(particles[i], box_id);
        pthread_mutex_lock(&locks[box_id.x + box_no * box_id.y]);
        //cout << box_id.x << "," << box_id.y << "," << thread_id << "l" << endl;
        box[box_id.x][box_id.y].push_back(&particles[i]);
        //cout << box_id.x << box_id.y << endl;
        pthread_mutex_unlock(&locks[box_id.x + box_no * box_id.y]);
    }

The strange thing is that if I uncomment either one or both of the cout lines, the program runs as expected and gives correct output, with no errors occurring (but this obviously kills performance and isn't an elegant solution).

box is a global declared as vector< vector< vector<particle_t*> > > box, which represents a decomposition of my (square) domain into boxes.

When the loop starts, box[i][j].size() has been set to zero for all i, j, and the loop is supposed to put the particles back into the box structure (the get_id function gives correct results, I've checked).

The locks array is declared as a global:

    pthread_mutex_t *locks;

Its size is set, and the locks are initialized, by thread 0 before the other threads are created:

    locks = (pthread_mutex_t *) malloc(box_no * box_no * sizeof(pthread_mutex_t));

    for (int i = 0; i < box_no * box_no; i++)
    {
        pthread_mutex_init(&locks[i], NULL);
    }

Do you have any idea what could cause this? The code also runs correctly if the number of processors is set to 1, and it seems that the more processors I run on, the earlier the seg fault occurs (it has run through the entire simulation once on two processors, but this seems to be an exception).

Thanks

This is only an educated guess, but based on the problem going away if you use one lock for all the boxes: push_back has to allocate memory, which it does via the std::allocator template. I don't think allocator is guaranteed to be thread-safe, and I don't think it's guaranteed to be partitioned, one for each vector, either. (The underlying operator new is thread-safe, but allocator usually does block-slicing tricks to amortize the cost of operator new.)

Is it practical for you to use reserve to preallocate space for all your vectors ahead of time, using some conservative estimate of how many particles are going to wind up in each box? That's the first thing I'd try.
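A minimal sketch of the preallocation idea, assuming the three-level vector layout from the question. The names Grid, make_grid, clear_grid and per_box_estimate are made up for illustration, and particle_t is only a stand-in for the real type:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct particle_t { double x, y; };  // stand-in for the question's particle type

using Grid = std::vector<std::vector<std::vector<particle_t*>>>;

// Build a box_no x box_no grid and reserve room in every box up front, so
// push_back does not reallocate until per_box_estimate entries are exceeded.
Grid make_grid(std::size_t box_no, std::size_t per_box_estimate)
{
    assert(box_no > 0);
    Grid box(box_no, std::vector<std::vector<particle_t*>>(box_no));
    for (auto& column : box)
        for (auto& cell : column)
            cell.reserve(per_box_estimate);
    return box;
}

// Empty every box before the next timestep. clear() drops the elements but
// leaves the reserved capacity in place on the common implementations.
void clear_grid(Grid& box)
{
    for (auto& column : box)
        for (auto& cell : column)
            cell.clear();
}
```

If the estimate is too low, push_back still works; it just falls back to allocating, so this only reduces how often allocation happens inside the locked region.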

The other thing I'd try is using one lock for all the boxes, which we know works, but moving the lock/unlock operations outside the for loop so that each thread gets to stash all its items at once. That might actually be faster than what you're trying to do -- less lock thrashing.
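Roughly what I mean, as a sketch rather than a drop-in replacement. box_lock, box_id_t, rebin_range and this version of get_id are hypothetical (the real get_id isn't shown in the question), and the sketch assumes particle positions lie in [0, 1):

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>
#include <vector>

struct particle_t { double x, y; };   // stand-in for the question's type
struct box_id_t { int x, y; };        // hypothetical index pair

// One lock guarding the whole grid.
static pthread_mutex_t box_lock = PTHREAD_MUTEX_INITIALIZER;
static std::vector<std::vector<std::vector<particle_t*>>> box;

// Hypothetical replacement for get_id: assumes positions lie in [0, 1).
static box_id_t get_id(const particle_t& p, int box_no)
{
    int x = static_cast<int>(p.x * box_no);
    int y = static_cast<int>(p.y * box_no);
    // Clamp so a particle on or past an edge still maps to a valid box.
    if (x < 0) x = 0;
    if (y < 0) y = 0;
    if (x >= box_no) x = box_no - 1;
    if (y >= box_no) y = box_no - 1;
    return {x, y};
}

// Re-bin particles[first, last): lock once, push everything, unlock once.
void rebin_range(std::vector<particle_t>& particles, int first, int last, int box_no)
{
    assert(first >= 0 && first <= last);
    assert(static_cast<std::size_t>(last) <= particles.size());
    pthread_mutex_lock(&box_lock);
    for (int i = first; i < last; i++) {
        box_id_t id = get_id(particles[i], box_no);  // local to this call
        box[id.x][id.y].push_back(&particles[i]);
    }
    pthread_mutex_unlock(&box_lock);
}
```

Note that the sketch keeps the box index in a local variable, so no two threads can share it. Whether box_id is per-thread in the original code isn't shown, so that is worth checking too.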

Are the box and box[i] vectors initialized properly? You only say the innermost set of vectors is set. Otherwise it looks like the x or y component of box_id is wrong and running off the end of one of your arrays.

What part of the loop is it crashing on?
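One cheap way to test both possibilities is to check the index against the actual vector sizes before using it. A sketch, assuming the same nested-vector layout; valid_box and checked_box are hypothetical helpers, not anything from the original code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct particle_t { double x, y; };  // stand-in for the question's particle type

using Grid = std::vector<std::vector<std::vector<particle_t*>>>;

// True only if box[x][y] actually exists, i.e. both outer levels were sized
// and the indices are inside them.
bool valid_box(const Grid& box, int x, int y)
{
    return x >= 0 && y >= 0
        && static_cast<std::size_t>(x) < box.size()
        && static_cast<std::size_t>(y) < box[x].size();
}

// Debug-build accessor: aborts with a diagnostic instead of silently
// writing past the end of one of the vectors.
std::vector<particle_t*>& checked_box(Grid& box, int x, int y)
{
    assert(valid_box(box, x, y));
    return box[x][y];
}
```

Using box.at(x).at(y) instead of operator[] gives a similar check that throws std::out_of_range rather than aborting.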
