Segmentation fault in std::less<char>

I have the following code (C++0x):

const set<char> s_special_characters =  { '(', ')', '{', '}', ':' };

void nectar_loader::tokenize( string &line, const set<char> &special_characters )
{
    auto it = line.begin();
    const auto not_found = special_characters.end();

    // first character special case
    if( it != line.end() && special_characters.find( *it ) != not_found )
        it = line.insert( it+1, ' ' ) + 1;

    while( it != line.end() )
    {
        // check if we're dealing with a special character
        if( special_characters.find(*it) != not_found ) // <----------
        {
            // ensure a space before
            if( *(it-1) != ' ' )
                it = line.insert( it, ' ' ) + 1;
            // ensure a space after
            if( (it+1) != line.end() && *(it+1) != ' ' )
                it = line.insert( it+1, ' ');
            else
                line.append(" ");
        }
        ++it;
    }
}

The crash points at the indicated line and results in a segfault with this gdb backtrace:

#0  0x000000000040f043 in std::less<char>::operator() (this=0x622a40, __x=@0x623610, __y=@0x644000)
    at /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.2/../../../../include/c++/4.5.2/bits/stl_function.h:230
#1  0x000000000040efa6 in std::_Rb_tree<char, char, std::_Identity<char>, std::less<char>, std::allocator<char> >::_M_lower_bound (this=0x622a40, __x=0x6235f0, __y=0x622a48, __k=@0x644000)
    at /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.2/../../../../include/c++/4.5.2/bits/stl_tree.h:1020
#2  0x000000000040e840 in std::_Rb_tree<char, char, std::_Identity<char>, std::less<char>, std::allocator<char> >::find (this=0x622a40, __k=@0x644000)
    at /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.2/../../../../include/c++/4.5.2/bits/stl_tree.h:1532
#3  0x000000000040e4fd in std::set<char, std::less<char>, std::allocator<char> >::find (this=0x622a40, __x=@0x644000)
    at /usr/lib/gcc/x86_64-unknown-linux-gnu/4.5.2/../../../../include/c++/4.5.2/bits/stl_set.h:589
#4  0x000000000040de51 in ambrosia::nectar_loader::tokenize (this=0x7fffffffe3b0, line=..., special_characters=...)
    at ../../ambrosia/Library/Source/Ambrosia/nectar_loader.cpp:146
#5  0x000000000040dbf5 in ambrosia::nectar_loader::fetch_line (this=0x7fffffffe3b0)
    at ../../ambrosia/Library/Source/Ambrosia/nectar_loader.cpp:112
#6  0x000000000040dd11 in ambrosia::nectar_loader::fetch_token (this=0x7fffffffe3b0, token=...)
    at ../../ambrosia/Library/Source/Ambrosia/nectar_loader.cpp:121
#7  0x000000000040d9c4 in ambrosia::nectar_loader::next_token (this=0x7fffffffe3b0)
    at ../../ambrosia/Library/Source/Ambrosia/nectar_loader.cpp:72
#8  0x000000000040e472 in ambrosia::nectar_loader::extract_nectar<std::back_insert_iterator<std::vector<ambrosia::target> > > (this=0x7fffffffe3b0, it=...)
    at ../../ambrosia/Library/Source/Ambrosia/nectar_loader.cpp:43
#9  0x000000000040d46d in ambrosia::drink_nectar<std::back_insert_iterator<std::vector<ambrosia::target> > > (filename=..., it=...)
    at ../../ambrosia/Library/Source/Ambrosia/nectar.cpp:75
#10 0x00000000004072ae in ambrosia::reader::event (this=0x623770)

I'm at a loss and have no clue what I'm doing wrong. Any help is much appreciated.

EDIT: the string at the moment of the crash is

sub Ambrosia : lib libAmbrosia

UPDATE:

I replaced the above function following the suggestions in the comments and answers. Below is the result.

const string tokenize( const string &line, const set<char> &special_characters )
{
    const auto not_found = special_characters.end();
    const auto end = line.end();
    string result;

    if( !line.empty() )
    {
        // copy first character
        result += line[0];

        char previous = line[0];
        for( auto it = line.begin()+1; it != end; ++it )
        {
            const char current = *it;

            if( special_characters.find(previous) != not_found )
                result += ' ';

            result += current;
            previous = current;
        }
    }
    return result;
}

Another guess is that line.append(" ") will sometimes invalidate it, depending on the original capacity of line.
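One way to sidestep the invalidation described above is to walk the string by index instead of by iterator, since an index stays meaningful even when insert() reallocates the buffer. The following is only a sketch under that assumption: tokenize_in_place is a hypothetical free function, not the asker's member function, and it assumes the space character is not itself in the special set.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical free-function variant of tokenize(). It tracks a position
// by index, so a reallocation inside insert() cannot leave it dangling.
void tokenize_in_place( std::string &line, const std::set<char> &special_characters )
{
    for( std::size_t i = 0; i < line.size(); ++i )
    {
        if( special_characters.count( line[i] ) == 0 )
            continue;

        // ensure a space before (nothing to do at the very start)
        if( i > 0 && line[i-1] != ' ' )
        {
            line.insert( i, 1, ' ' );
            ++i; // the special character moved one position to the right
        }
        // ensure a space after, also when the special character is last
        if( i + 1 == line.size() || line[i+1] != ' ' )
            line.insert( i + 1, 1, ' ' );
    }
}
```

Note that, unlike the original else branch, this only adds a trailing space when the special character really is the last character of the line.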

You don't check it != line.end() before the first time you dereference it.

I could not spot the error. I would suggest stepping through slowly with the debugger, since you have already identified where the issue occurs.

I'll just say that, in general, modifying what you are iterating over is extremely prone to failure.

I'd recommend using Boost Tokenizer, and more precisely boost::token_iterator combined with boost::char_separator (code example included).

You could then simply build a new string from the first and return the new string from the function. The speedup in computation should cover the cost of the memory allocation.
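The "build a new string" idea can be sketched without Boost as follows. This is an illustration only: spaced_copy is a hypothetical name, the code assumes a C++11 compiler (range-based for, std::string::back) and assumes the space character is not in the special set. It leaves the input untouched, so there is no iterator into a changing container to worry about.

```cpp
#include <set>
#include <string>

// Hypothetical helper: returns a copy of `line` in which every special
// character is separated from its neighbours by a space.
std::string spaced_copy( const std::string &line, const std::set<char> &special_characters )
{
    std::string result;
    result.reserve( line.size() ); // starting guess; grows if spaces are added

    bool after_special = false; // was the previously copied character special?
    for( const char current : line )
    {
        const bool special = special_characters.count( current ) != 0;

        // a space is needed after a special character, and before one
        // unless the output already ends in a space
        const bool need_space =
            ( after_special && current != ' ' ) ||
            ( special && !result.empty() && result.back() != ' ' );

        if( need_space )
            result += ' ';
        result += current;
        after_special = special;
    }
    return result;
}
```

This sketch does not append a space after a special character that ends the line; if the tokenizer downstream relies on that, one more check after the loop would be needed.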
