简体   繁体   中英

issue with std::map std::find

I have working c++ deamon. The problem is that daemon crash one or two times per month. As you see from GDB output below its happens when daemon search for unsigned int sessionID at std::map <unsigned int const, SessionID*> container. I can not reproduce the problem and think that its maybe something wrong with data from users (maybe std::sting cookie_ssid have some unexpected data and after conversion with strtoul something going wrong. ( know, that this is not the correct way to get unsigned int )

I have only .core file after daemon crash. and see that problem at if (!_M_impl._M_key_compare(_S_key(__x), __k)) . Any ideas how to solve this problem? Many thanks.

GDB output:

#0  std::_Rb_tree<unsigned int, std::pair<unsigned int const, SessionID*>, std::_Select1st<std::pair<unsigned int const, SessionID*> >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, SessionID*> > >::find (this=0x529df15c, __k=@0x7f5fab7c)
    at stl_tree.h:1376
##########
1376            if (!_M_impl._M_key_compare(_S_key(__x), __k))
##########
#0  std::_Rb_tree<unsigned int, std::pair<unsigned int const, SessionID*>, std::_Select1st<std::pair<unsigned int const, SessionID*> >, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, SessionID*> > >::find (
    this=0x529df15c, __k=@0x7f5fab7c) at stl_tree.h:1376
#1  0x0805e6be in TR::find_session (this=0x529df110,
    cookie_ssid=@0x47ef3614, ptr_to_ptr_session=0x7f5fac7c)
    at stl_map.h:542

Function TR::find_session released like this:

bool TR::find_session ( const std::string &cookie_ssid, SessionID **ptr_to_ptr_session )
{
    unsigned int uint_sessionid = std::strtoul ( cookie_ssid.c_str(),NULL,0);

MUTEX_map_sessionids.lock_reading();
    std::map<unsigned int, SessionID*>::iterator it_sessionids = map_sessionids.find( uint_sessionid );

    if ( it_sessionids != map_sessionids.end() )
    { // exists
        *ptr_to_ptr_session = it_sessionids->second;
        MUTEX_map_sessionids.unlock();
        return true;
    }

MUTEX_map_sessionids.unlock();  
return false;   
}

EDIT My cleanup function that working at separated thread (one time per minute or 5 minutes). As requested at comments. I am not sure with this function. Maybe its buggy....

void TR::cleanup_sessions () // not protected from multithread using! used only at one thread
{
std::list<SessionID*> list_to_clean; // tmplary store sessions to delete

MUTEX_map_sessionids.lock_reading();
std::map<unsigned int, SessionID*>::iterator it_sessionids = map_sessionids.begin();
MUTEX_map_sessionids.unlock();

while ( true )
{
    MUTEX_map_sessionids.lock_writing();
    if (it_sessionids == map_sessionids.end() )
    {
        MUTEX_map_sessionids.unlock();
        break;
    }

    SessionID *ptr_sessionid = it_sessionids->second;

    time_t secondsnow = time (NULL);

    ptr_sessionid->MUTEX_all_session.lock_reading();
    time_t lastaccesstime = ptr_sessionid->last_access_time;
    size_t total_showed = ptr_sessionid->map_showed.size(); 
    ptr_sessionid->MUTEX_all_session.unlock();


    if ( lastaccesstime and secondsnow - lastaccesstime > LOCALSESSION_LIFETIME_SEC ) // lifetime end!
    {
        // delete session from map
        map_sessionids.erase( it_sessionids++ ); // Increments the iterator but returns the original value for use by erase
        MUTEX_map_sessionids.unlock();              


        list_to_clean.push_back ( ptr_sessionid ); // at the end
    }
    else if ( total_showed == 0 and secondsnow - lastaccesstime > 36000 ) // not active for N secontes
    {
        map_sessionids.erase( it_sessionids++ ); // Increments the iterator but returns the original value for use by erase
        MUTEX_map_sessionids.unlock();

        // add pointer to list to delete it latter
        list_to_clean.push_back ( ptr_sessionid ); // at the end            
    }
    else
    {
        ++it_sessionids; // next
        MUTEX_map_sessionids.unlock();              
    }

}

// used? pause
if ( !list_to_clean.empty() ) 
{
    //sleep(1);
}

// cleanup session deleted from working map
while ( !list_to_clean.empty() )
{
    SessionID *ptr_sessionid_to_delete = list_to_clean.front();
    list_to_clean.pop_front();

    ptr_sessionid_to_delete->MUTEX_all_session.lock_writing(); // protected lock session mutex. can not delete session if its already locked. (additational protection)
    ptr_sessionid_to_delete->cleanup();
    delete ptr_sessionid_to_delete;
}

}

NOTE as you can see on every itteration I lock/unlock map_sessions, because other threads at this time find/insert new sessions and its critical, because users can not wait.

Note that any modification of a map may invalidate any iterator in that map. You have:

MUTEX_map_sessionids.lock_reading();
std::map<unsigned int, SessionID*>::iterator it_sessionids = map_sessionids.begin();
MUTEX_map_sessionids.unlock();

Now immediately after this unlock, some other thread might acquire the lock and do some operation that invalidates it_sessionids , which will leave it in a state where the subsequent code will corrupt the map, leading to a later crash.

You need to acquire AND HOLD a lock over the entire lifetime of the iterator. It looks like you have reader/writer locks, so you need only hold a read lock for the whole time, upgrading it to a write lock when you want to modify the map and then immediately downgrading it back to a read lock after the modification. Holding a read lock for a long time will only block other threads that want to acquire a write lock, not other threads that only need a read lock.

Answering your questions in the comment:

  1. If you cannot hold the lock for an extended period of time, then you can't have iterators that remain valid for an extended period of time. One thing you can do is occasionally remember approximately where you are in the map, release the lock (giving other threads a chance and invalidating the iterator), then reaquire the lock and create a new iterator at approximately the same point. You could add something like this in the middle of your loop:

     if (++count > limit) { // only do this every Nth iteration unsigned int now_at = it_sessionids->first; MUTEX_map_sessionids.unlock(); // give others a chance MUTEX_map_sessionids.lock_reading(); it_sessionids = map_sessionids.lower_bound(now_at); count = 0; } 
  2. Upgrading a lock from read only to read/write is a primitive operation on reader/writer locks that your implementation may not support. If not then you're out of luck and need to hold a writer lock the entire time.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM