
Error message while running a C++ program

I am storing around 14 GB of data in the following map:

#include <cstring>   // strlen, strcpy, strcmp
#include <map>

using std::map;

struct data
{
    char* a;
    char* b;
    data(const char* _a, const char* _b)
    {
        // Deep-copy both strings; strcpy copies the terminating '\0' as well.
        size_t alen = strlen(_a);
        a = new char[alen + 1];
        strcpy(a, _a);

        size_t blen = strlen(_b);
        b = new char[blen + 1];
        strcpy(b, _b);
    }

    ~data()
    {
        delete [] a;
        delete [] b;
    }
};

// Comparator so the map orders its const char* keys by string contents
// rather than by pointer value.
struct ltstr
{
    bool operator()(const char* s1, const char* s2) const
    {
        return strcmp(s1, s2) < 0;
    }
};

map<const char*, data*, ltstr> m;

The program runs through a certain number of records (10,440,440 out of 26,293,289), and after some time I get the following error message:

Program terminated with signal SIGKILL, Killed.
The program no longer exists. 
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.25.el6.x86_64 libgcc-4.4.5-6.el6.x86_64 libstdc++-4.4.5-6.el6.x86_64

How can I avoid this termination?

Server Specification:

  Mem:  24605344k total, 15148556k used,  9456788k free,    20892k buffers
  Swap:  2097144k total,   161364k used,  1935780k free, 14469244k cached

The OOM killer is probably stepping in and killing your process. Take a look in your logs to be sure.
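The kernel log (dmesg or /var/log/messages) will contain a line like "Killed process <pid>" when the OOM killer fires. You can also watch the process's own growth from inside the program while the records load; a minimal Linux-specific sketch (the helper name resident_kb is mine), reading the VmRSS line from /proc/self/status:

#include <cstdlib>
#include <fstream>
#include <string>

// Return the process's current resident set size in kB, or -1 on failure.
// Linux-specific: parses the "VmRSS:   1234 kB" line of /proc/self/status.
long resident_kb()
{
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line))
        if (line.compare(0, 6, "VmRSS:") == 0)
            return std::atol(line.c_str() + 6);
    return -1;
}

Logging this every million insertions or so shows whether the map itself accounts for the growth.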

Without knowing more details about the program, you might try memory-mapping the file to reduce memory usage.
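For illustration, a minimal sketch of that approach on Linux using mmap(2); the file name records.dat and the idea of indexing by offset are assumptions, not details from the question:

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    int fd = open("records.dat", O_RDONLY);
    if (fd == -1) { perror("open"); return 1; }

    struct stat sb;
    if (fstat(fd, &sb) == -1) { perror("fstat"); return 1; }

    // Map the whole file read-only; pages are loaded lazily on access.
    void* p = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    const char* base = static_cast<const char*>(p);
    // ... build the index as offsets into [base, base + sb.st_size)
    // instead of new[]-ing a copy of every string ...

    munmap(p, sb.st_size);
    close(fd);
    return 0;
}

Because the mapping is backed by the file, the kernel can evict and re-read pages under memory pressure instead of the OOM killer terminating the process.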

Since you're apparently running out of memory, the obvious thing to do is to reduce the amount of memory you're using. One possibility would be to use something like a Patricia trie to store the data instead of a std::map (which normally uses a red-black tree).
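A full Patricia trie is too long to sketch here, but the same goal (fewer heap blocks and less per-node overhead than a red-black tree) can be illustrated with a sorted std::vector plus binary search. This is a C++11 sketch; the names packed_entry, add, finish, and find are mine:

#include <algorithm>
#include <cstring>
#include <vector>

// One heap block per record, laid out as: key '\0' value '\0'.
struct packed_entry
{
    char* buf;
    const char* key() const   { return buf; }
    const char* value() const { return buf + std::strlen(buf) + 1; }
};

std::vector<packed_entry> entries;

// Build phase: append every record, then sort once before any lookups.
void add(const char* k, const char* v)
{
    size_t klen = std::strlen(k), vlen = std::strlen(v);
    char* buf = new char[klen + 1 + vlen + 1];
    std::memcpy(buf, k, klen + 1);
    std::memcpy(buf + klen + 1, v, vlen + 1);
    packed_entry e = { buf };
    entries.push_back(e);
}

void finish()
{
    std::sort(entries.begin(), entries.end(),
              [](const packed_entry& x, const packed_entry& y)
              { return std::strcmp(x.key(), y.key()) < 0; });
}

// Binary search instead of tree traversal; returns NULL if absent.
const char* find(const char* k)
{
    auto it = std::lower_bound(entries.begin(), entries.end(), k,
                               [](const packed_entry& e, const char* key)
                               { return std::strcmp(e.key(), key) < 0; });
    if (it != entries.end() && std::strcmp(it->key(), k) == 0)
        return it->value();
    return 0;
}

This cuts the several heap allocations per record in the original (map node, data object, and its two strings) down to one; the trade-off is that all insertions must happen before lookups begin.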

Exactly how much that will gain you depends on just how much redundancy there is in your strings. To an extent, it will also depend on the size of the strings you're using as keys versus the size of the data you're storing with them.

Depending on the situation, you might also consider using something like Huffman compression with a fixed table for the strings you're storing as the data associated with each key. Although Huffman compression isn't necessarily the most effective in other situations, with a fixed table it has the advantage of working reasonably well on individual strings like the ones you're dealing with here. If the strings in the data part are long (e.g., an average of at least 8 KB for the two strings), it may well be worth applying an LZ*-family compression in addition to the Huffman compression; if they're short (less than at least a few kilobytes), LZ probably won't work very well unless you're willing to group the strings together into blocks, in which case you may have to decompress a few kilobytes of other strings to get at the one you care about when retrieving some data.
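A Huffman coder with a custom fixed table is more than a short sketch; as a rough stand-in for the block-wise LZ* + Huffman scheme described above, here is what one-shot compression of a block of strings could look like with zlib (deflate is LZ77 plus Huffman; link with -lz; the function names are mine):

#include <zlib.h>
#include <string>
#include <vector>

// Compress one block of concatenated strings with deflate.
std::vector<unsigned char> compress_block(const std::string& block)
{
    uLongf destLen = compressBound(block.size());
    std::vector<unsigned char> out(destLen);
    if (compress(&out[0], &destLen,
                 reinterpret_cast<const Bytef*>(block.data()),
                 block.size()) != Z_OK)
        out.clear();               // compression failed
    else
        out.resize(destLen);
    return out;
}

// originalSize must be stored alongside each block: uncompress needs
// the destination size up front.
std::string decompress_block(const std::vector<unsigned char>& in,
                             uLongf originalSize)
{
    std::string out(originalSize, '\0');
    if (uncompress(reinterpret_cast<Bytef*>(&out[0]), &originalSize,
                   in.data(), in.size()) != Z_OK)
        out.clear();               // corrupt block or wrong size
    return out;
}

Each block trades retrieval time for space: reading one string costs a whole-block uncompress, which is the grouping cost mentioned above.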

While compression will normally be slower than accessing the data directly from memory without compressing/decompressing, it will still usually be faster than retrieving data from disk, as will typically happen if you run out of physical memory and end up using virtual memory.

How can I avoid this termination?

By rethinking your design. What are you trying to accomplish? In particular, why do you need a lookup table with 27 million entries in memory?
