
stringstream overflow at 4GB

I'm having trouble getting past a 4GB limit with stringstream, even though the code runs on a 64-bit Linux box with enough memory.

The test code below (revised after reading your comments) dumps core once the stream reaches about 4GB.

From the gdb trace, stringstream uses the default std::char_traits, where int_type is a 32-bit int rather than a 64-bit size_t. Any suggestions for a workaround?

#include <stdint.h>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

int main()
{
    string str;
    stringstream ss;
    const size_t GB = size_t(1) << 30;  // 2^30 bytes, no floating-point pow needed

    // Sanity check: report the sizes of the relevant types.
    str.resize(GB, '1');
    cerr << "1GB=" << str.size()
             << ", string::max_size=" << str.max_size()/GB << "GB"
             << ", sizeof(int)=" << sizeof(int)
             << ", sizeof(int64_t)=" << sizeof(int64_t)
             << ", sizeof(size_t)=" << sizeof(size_t)
             << endl;
    string().swap(str);  // release the memory

    // A plain std::string grows past 4GB without trouble.
    str.resize(6*GB, '6');
    cerr << "str.size()=" << (str.size() / GB) << "GB allocated successfully"
            << ", ended with " << str.substr(str.size()-5, 5) << endl;
    string().swap(str);

    // Append 0.25GB (plus a newline) per iteration and report the stream size.
    str.resize(GB/4, 'Q');
    cerr << "writing to stringstream..." << std::flush;
    for (int i = 0; i < 30; ++i) {
        ss << str << endl;
        // Note: ss.str() copies the whole buffer on every call.
        cerr << double(ss.str().size())/GB << "GB " << std::flush;
    }
    cerr << endl;
    return 0;
}

The output is:

1GB=1073741824, string::max_size=4294967295GB, sizeof(int)=4, sizeof(int64_t)=8, sizeof(size_t)=8
str.size()=6GB allocated successfully, ended with 66666
writing to stringstream...0.25GB 0.5GB 0.75GB 1GB 1.25GB 1.5GB 1.75GB 2GB 2.25GB 2.5GB 2.75GB 3GB 3.25GB 3.5GB 3.75GB Segmentation fault (core dumped)

The gdb stack trace is:

(gdb) where
#0  0x00002aaaaad5e0c1 in std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::overflow(int) () from /usr/lib64/libstdc++.so.6
#1  0x00002aaaaad62cbd in std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, long) () from /usr/lib64/libstdc++.so.6
#2  0x00002aaaaad5657d in std::basic_ostream<char, std::char_traits<char> >& std::operator<< <char, std::char_traits<char>, std::allocator<char> >(std::basic_ostream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /usr/lib64/libstdc++.so.6
#3  0x000000000040112b in main ()

The binary appears to be 64-bit:

$ file a.out
a.out: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), for GNU/Linux 2.6.9, not stripped

Nick, good point on checking the preprocessor output. Somehow std::char_traits overrides __gnu_cxx::char_traits and redefines int_type as int, rather than unsigned long as in __gnu_cxx::char_traits. This is very surprising!

namespace __gnu_cxx
{
# 61 "/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/char_traits.h" 3
  template <class _CharT>
    struct _Char_types
    {
      typedef unsigned long int_type;
      typedef std::streampos pos_type;
      typedef std::streamoff off_type;
      typedef std::mbstate_t state_type;
    };
# 86 "/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/char_traits.h" 3
  template<typename _CharT>
    struct char_traits
    {
      typedef _CharT char_type;
      typedef typename _Char_types<_CharT>::int_type int_type;
      typedef typename _Char_types<_CharT>::pos_type pos_type;
      typedef typename _Char_types<_CharT>::off_type off_type;
      typedef typename _Char_types<_CharT>::state_type state_type;
              ....
               };

namespace std
{
# 224 "/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/char_traits.h" 3
  template<class _CharT>
    struct char_traits : public __gnu_cxx::char_traits<_CharT>
    { };



  template<>
    struct char_traits<char>
    {
      typedef char char_type;
      typedef int int_type;
      typedef streampos pos_type;
      typedef streamoff off_type;
      typedef mbstate_t state_type;
              ...
              };

EDIT: The following is from the cplusplus.com reference (http://www.cplusplus.com/reference/string/char_traits/), where the char_traits specialization uses int:

typedef INT_T int_type;
Where INT_T is a type that can represent all the valid characters representable by a char_type plus an end-of-file value (eof) which is compatible with iostream class member functions.
For char_traits<char> this is int, and for char_traits<wchar_t> this is wint_t.

Can you post a preprocessed source file?

Reference to _Char_types removed because it was misleading.

I'm not sure int_type is related; the specialisation for char does indeed use int for int_type.

int_type is a typedef for a type that holds an individual character value plus the end-of-file marker (i.e. at least one byte for ASCII, wider for wchar_t). It is not used to index or measure a range of characters (see std::streamoff below).


From your preprocessed output I suspect that std::streamoff could be the culprit. From the headers on my system:

# 90 "/usr/include/c++/4.6/bits/postypes.h" 3
  typedef long streamoff;

std::streamoff is used in std::streampos, which, if defined incorrectly, would I think lead to the stringstream calling the overflow function.

Have you checked that the correct headers are included?

# 61 "/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/bits/char_traits.h" 3

resolves to the following path:

/usr/include/c++/4.1.2/bits/char_traits.h

-nick
