简体   繁体   中英

C++ unordered_map emplace() function throwing seg fault and I have no idea why

I am getting a seg fault while using std::unordered_map::emplace() . Here is the minimal reproducible example:

#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;

class WordTable {
public:
  WordTable() {
    total = 0;
  }
  ~WordTable() {}

  void addWord(const string word, const int incr = 1) {
    cout << "begin emplace" << endl;
    table.emplace(word, Node()); //this is where the seg fault occurs
    cout << "emplace succeeded" << endl;
    if (incr) {
      table[word].incrementCount();
      incrementTotal();
    }
  }
private:
  struct Node {
  public:
    Node() {
      count = 0;
      kids = new WordTable();
    }
    ~Node() {
      delete kids;
    }
    int incrementCount() {
      return ++count;
    }
  private:
    int count;
    WordTable* kids;
  };

  int incrementTotal() {
    return ++total;
  }
  int total;
  unordered_map<string, Node> table;
};

int main() {
  WordTable tmp;
  cout << "add word 1" << endl;
  tmp.addWord("Hi");
  cout << "end add word 1" << endl;
  tmp.addWord("Hi");
  cout << "end add word 2" << endl;

  WordTable* test = new WordTable();
  cout << "add word 3" << endl;
  test->addWord("Hi");
  cout << "end add word 3" << endl;
}

And the corresponding output:

add word 1
begin emplace
emplace succeeded
end add word 1
begin emplace
emplace succeeded
end add word 2
add word 3
begin emplace
Segmentation fault (core dumped)

The seg fault happens in the call to .emplace() in the third call to addWord() .

What is supposed to happen is that addWord("Hi") adds a mapping to the std::unordered_map<std::string, Node> table. That mapping should have "Hi" as the key value and a Node() object for the mapped value.

Here's the first weird part: if I only have one call to addWord() before that third call, there is no seg fault. Here's the output:

add word 1
begin emplace
emplace succeeded
end add word 1
end add word 2
add word 3
begin emplace
emplace succeeded
end add word 3

The second weird part is that if I statically allocate test , then there is also no seg fault. Here's the output from that:

add word 1
begin emplace
emplace succeeded
end add word 1
begin emplace
emplace succeeded
end add word 2
add word 3
begin emplace
emplace succeeded
end add word 3

I have no idea what's happening nor why it's happening. I simply don't understand how a seg fault could occur inside the STL unordered_map::emplace() . The only issue I can think of is the way I am creating a Node() in addWord() , but I don't see how that would enable addWord() to succeed in the first two calls but seg fault in the third.

I would greatly appreciate any assistance!!!

In Node , you allocate and free WordTable *kids in the constructor and destructor, but it will have the default copy constructor and operator. These will just copy the pointer itself, not create a new object, like:

Node(const Node &cp) // default
    : count(cp.count), kids(cp.kids) // no "new"!
{}

When the first of these copies is destructed, the pointer is deleted, leaving the other copies with an invalid pointer, which will hopefully crash on access (the alternative is often some form of heap corruption). In this case the second access seems to vary by compiler, GCC seems to run into issues within emplace due to making extra copies, MSVC doesn't until it goes to ~WordTable() when main returns (the WordTable tmp stack variable).

You can see this by tracking the new/delete:

Node() {
  count = 0;
  kids = new WordTable();
  cout << "new kids " << kids << endl;
}
~Node() {
  cout << "delete kids " << kids << endl;
  delete kids;
}
// GCC
add word 1
begin emplace
new kids 0xa38c30
delete kids 0xa38c30 // was deleted inside emplace, but when `~WordTable` happens later (if it got that far) will be deleted a second time, the one stored in the WordTable instance.
emplace succeeded
end add word 1
begin emplace
new kids 0xa38c30 // Can get the same pointer as 0xa38c30 is now "free"
delete kids 0xa38c30 // and again
delete kids 0xa38c30 // this time twice due to some GCC specific implementation detail, when the value "Hi" already existed so the value was no longer wanted
emplace succeeded
end add word 2
add word 3
begin emplace
new kids 0xa38cf0 // same pointer again
SIGSEGV // this time not so lucky, maybe because that double deletion just above corrupted something

You can prevent the default copy by "deleting" the constructor, operator:

Node(const Node &) = delete;
Node &operator = (const Node &) = delete;

This will turn table.emplace(word, Node()); into a compile error, as this is where the copy is happening. Although you called emplace , you passed it a full temporary object, so it will try and emplace to the copy constructor Node(const Node &) . What you want to do with emplace is pass it the constructor arguments, this is a bit tricky for the default constructor case, a simple table.emplace(word) won't compile:

table.emplace(std::piecewise_construct, std::make_tuple(word), std::make_tuple());

Alternatively if you want your object to be safely copyable, implement those functions explicitly.

Node(const Node &cp)
    : count(cp.count), kids(new WordTable()) // create a full new object this time
{
    *kids = *cp.kids;
    cout << "copy constructor " << kids << endl;
}
Node &operator = (const Node &cp)
{
    *kids = *cp.kids;
    cout << "copy operator " << kids << endl;
    return *this;
}
add word 1
begin emplace
new kids 0xee8c30
copy constructor 0xee8cd0 // this time made a new object
delete kids 0xee8c30 // deleted the original but 0xee8cd0 is still valid
emplace succeeded
end add word 1
begin emplace
new kids 0xee8c30
copy constructor 0xee8d90
delete kids 0xee8d90
delete kids 0xee8c30
emplace succeeded
end add word 2
add word 3
begin emplace
new kids 0xee8d40
copy constructor 0xee8de0
delete kids 0xee8d40
emplace succeeded
end add word 3
delete kids 0xee8cd0 // that first copy getting deleted when main returns

The copy of the WordTable is fine, as unordered_map<string, Node> will copy each key/value individually, using the ones just provided.

Another similar alternative would be to provide suitable move constructors and operators, either along side the copy ones, or with the copy deleted.

Node(const Node &cp) = delete;
Node &operator = (const Node &cp) = delete;
Node(Node && mv)
    : count(mv.count), kids(mv.kids)
{
    mv.kids = nullptr; // took kids away from mv
    cout << "move constructor " << kids << endl;
}
Node &operator = (Node &&mv)
{
    swap(count, mv.count);
    swap(kids, mv.kids);
    cout << "move operator " << kids << " from " << mv.kids << endl;
    return *this;
}
add word 1
begin emplace
new kids 0x1c4cc30 // temporary object
move constructor 0x1c4cc30 // final emplace value
delete kids 0 // moved it out, so nothing is deleted here
emplace succeeded
end add word 1
begin emplace
new kids 0x1c4ccf0
move constructor 0x1c4ccf0
delete kids 0x1c4ccf0  // delete due to duplicate "Hi"
delete kids 0 // again it was moved, so empty
emplace succeeded
end add word 2
add word 3
begin emplace
new kids 0x1c4ccf0
move constructor 0x1c4ccf0
delete kids 0
emplace succeeded
end add word 3
delete kids 0x1c4cc30

Remember that whatever state you leave the moved object in (eg here zero count and null kids ) needs to itself be valid. So you would need to be careful and have appropriate if (kids == nullptr) checks if you did that.


A case like this is also a good one for a std::unique_ptr , where some object is being created and destroyed within a single unique place, saving the need for a manual delete . It will also automatically prevent the default copy, as unique_ptr itself disallows copying, but allows moving (Note: If you have an a ~Node() , you won't get the move functions automatically).

struct Node {
public:
    Node()
        : count(0)
        , kids(std::make_unique<WordTable>()) // std::unique_ptr(new WordTable())
    {}
    int incrementCount() {
        return ++count;
    }
private:
    int count;
    std::unique_ptr<WordTable> kids;
};

If we compile and run with Valgrind, we see some compiler warnings that hint to the problem:

g++ -std=c++2a -fPIC -g -Wall -Wextra -Wwrite-strings -Wno-parentheses -Wpedantic -Warray-bounds  -Weffc++       59351752.cpp    -o 59351752
59351752.cpp:23:10: warning: ‘struct WordTable::Node’ has pointer data members [-Weffc++]
   23 |   struct Node {
      |          ^~~~
59351752.cpp:23:10: warning:   but does not override ‘WordTable::Node(const WordTable::Node&)’ [-Weffc++]
59351752.cpp:23:10: warning:   or ‘operator=(const WordTable::Node&)’ [-Weffc++]

and and indication of our first use-after-free:

valgrind --leak-check=full ./59351752   
==1137125== Memcheck, a memory error detector
==1137125== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1137125== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==1137125== Command: ./59351752
==1137125== 
add word 1
begin emplace
emplace succeeded
end add word 1
begin emplace
==1137125== Invalid read of size 8
==1137125==    at 0x10B3D8: std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_begin() const (hashtable.h:384)
==1137125==    by 0x10AFD1: std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::clear() (hashtable.h:2028)
==1137125==    by 0x10ACCD: std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1352)
==1137125==    by 0x10A905: std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, WordTable::Node, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> > >::~unordered_map() (unordered_map.h:102)
==1137125==    by 0x10A94F: WordTable::~WordTable() (59351752.cpp:11)
==1137125==    by 0x10AAA3: WordTable::Node::~Node() (59351752.cpp:30)
==1137125==    by 0x10A9C5: WordTable::addWord(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int) (59351752.cpp:15)
==1137125==    by 0x10A3B2: main (59351752.cpp:52)
==1137125==  Address 0x4d7d228 is 24 bytes inside a block of size 64 free'd
==1137125==    at 0x483708B: operator delete(void*, unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1137125==    by 0x10AAB0: WordTable::Node::~Node() (59351752.cpp:30)
==1137125==    by 0x10CBD9: std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>::~pair() (stl_pair.h:208)
==1137125==    by 0x10CC05: void __gnu_cxx::new_allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, true> >::destroy<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>*) (new_allocator.h:153)
==1137125==    by 0x10C58D: void std::allocator_traits<std::allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, true> > >::destroy<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >(std::allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, true> >&, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>*) (alloc_traits.h:497)
==1137125==    by 0x10BC32: std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, true> > >::_M_deallocate_node(std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, true>*) (hashtable_policy.h:2102)
==1137125==    by 0x10B548: std::pair<std::__detail::_Node_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, false, true>, bool> std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_emplace<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, WordTable::Node>(std::integral_constant<bool, true>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, WordTable::Node&&) (hashtable.h:1655)
==1137125==    by 0x10B0B2: std::pair<std::__detail::_Node_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, false, true>, bool> std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::emplace<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, WordTable::Node>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, WordTable::Node&&) (hashtable.h:749)
==1137125==    by 0x10AD2D: std::pair<std::__detail::_Node_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node>, false, true>, bool> std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, WordTable::Node, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, WordTable::Node> > >::emplace<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, WordTable::Node>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, WordTable::Node&&) (unordered_map.h:388)
==1137125==    by 0x10A9B9: WordTable::addWord(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int) (59351752.cpp:15)
==1137125==    by 0x10A3B2: main (59351752.cpp:52)
==1137125==  Block was alloc'd at
==1137125==    at 0x4835DEF: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1137125==    by 0x10AA66: WordTable::Node::Node() (59351752.cpp:27)
==1137125==    by 0x10A9A6: WordTable::addWord(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int) (59351752.cpp:15)
==1137125==    by 0x10A3B2: main (59351752.cpp:52)
==1137125== 

This is clearly caused by the raw pointer in Node that is copied in the compiler-generated copy constructor.

If we simply change Node::kids to be a std::unique_ptr and delete test before leaving main() , then we get a clean Valgrind run.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM