
Seg. fault with std::unique_ptr and ctor

For a parser I am currently implementing, I have (among others) these private functions:

Parser private methods:

    Token const* current_token() const;
    Token const* next_token();
    Token const* peek_token();

    std::unique_ptr<ast::Expression> parse_expression();
    std::unique_ptr<ast::TypeSpecifier> parse_type_specifier();
    std::unique_ptr<ast::VariableDeclarationStatement> parse_variable_declaration();
    std::unique_ptr<ast::Statement> parse_function_definition();
    std::unique_ptr<ast::Statement> parse_top_level_statement();

The implementation of the parse_variable_declaration method is this:

parse_variable_declaration():

std::unique_ptr<ast::VariableDeclarationStatement> Parser::parse_variable_declaration() {
    next_token(); // consume 'var'

    if (current_token()->get_type() != TokenTypes::identifier) {
        throw parser_error(current_token(), "", "expected identifier\n");
    }
    const auto id = current_token(); // store identifier
    next_token(); // consume identifier

    std::unique_ptr<ast::TypeSpecifier> type;
    std::unique_ptr<ast::Expression> expr;

    auto assignment_required = true;
    if (current_token()->get_type() == TokenTypes::op_colon) { // optional type specifier
        next_token(); // consume ':'

        type = parse_type_specifier();
        assignment_required = false;
    }

    if (assignment_required && current_token()->get_type() != TokenTypes::op_equals) {
        throw parser_error(current_token(), "", "expected equals operator\n");
    }

    if (current_token()->get_type() == TokenTypes::op_equals) {
        next_token(); // consume '='

        expr = parse_expression();
    }

    if (current_token()->get_type() != TokenTypes::op_semi_colon) {
        throw parser_error(current_token(), "", "expected semi-colon\n");
    }

    next_token(); // consume ';'

    DEBUG_STDERR("parsed: variable_declaration_statement\n");
    return std::make_unique<ast::VariableDeclarationStatement>(
        id->get_string(), std::move(type), std::move(expr));
}

The last statement (the return) ends in a segmentation fault. It basically calls the constructor of VariableDeclarationStatement:

VariableDeclarationStatement ctor:

VariableDeclarationStatement::VariableDeclarationStatement(
    std::string const& name,
    std::unique_ptr<TypeSpecifier> type_specifier,
    std::unique_ptr<Expression> expr
):
    m_name{name},
    m_type_specifier{std::move(type_specifier)},
    m_expr{std::move(expr)}
{}

I have been debugging this since yesterday and can't find out why it does not work as intended. I want to build the Abstract Syntax Tree (the parser output) with unique pointers to child nodes, because each node is the only owner of its children. That is why I am trying so hard to make unique pointers work here.

Console output: DEBUG_STDERR

parsed: primitive_type_int // from parse_type_specifier()
parsed: integral_expression // from parse_expression()
parsed: variable_declaration_statement
[1]    12638 segmentation fault (core dumped)  ./cion_compiler

The move operations on unique pointers basically boil down to simple pointer copies. There is no reason why any implementation of unique_ptr would dereference the pointers in the process of moving them. Therefore, the likelihood that this operation is responsible for the seg-fault is virtually zero.
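To make this point concrete, here is a minimal sketch. Node and Holder are hypothetical stand-ins, not the classes from the question; Holder only mimics a constructor that takes a unique_ptr by value and moves it into a member. It shows that moving an empty unique_ptr is well defined and that a move merely transfers the stored pointer.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an AST node (not the asker's real class).
struct Node {
    std::string label;
};

// Takes ownership by value and moves it into a member,
// like the constructor shown in the question.
struct Holder {
    explicit Holder(std::unique_ptr<Node> child) : m_child{std::move(child)} {}
    std::unique_ptr<Node> m_child;
};

// Moving an empty unique_ptr never dereferences anything.
inline bool moving_empty_pointer_is_safe() {
    std::unique_ptr<Node> empty;   // holds nullptr
    Holder h{std::move(empty)};    // only the (null) pointer value is transferred
    return h.m_child == nullptr && empty == nullptr;
}

// Moving a non-empty unique_ptr hands over the same address and nulls the source.
inline bool moving_transfers_ownership() {
    auto node = std::make_unique<Node>(Node{"int"});
    Node* raw = node.get();
    Holder h{std::move(node)};
    return node == nullptr && h.m_child.get() == raw;
}
```

So even when type or expr is still empty at the return statement, passing it through std::move is not a problem in itself.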

In your return statement / constructor call, you do have one very obvious pointer dereference, as part of the id->get_string() call.

For one, the id pointer is created like so:

  const Token* const id = current_token(); // store identifier
  next_token(); // consume identifier

Unless there is a guarantee that any pointer returned by current_token() stays valid for the lifetime of the current parsing operation, it is very possible that after the call to next_token() the id pointer is invalid, i.e., pointing to a non-existent or defunct Token object.

Even if the id pointer still points to an existing Token object, it is possible that it is in a "zombie" state, and that obtaining a string from it, through get_string() , is an invalid operation.

If I were you, that is where I would look for the source of the seg-fault. You might also want to run this in a (memory) debugger. It will likely point you to the get_string function, either at the dereferencing of the this pointer (the id pointer) or during the construction of the string itself. It could also point you to the virtual-table look-up, if get_string is a virtual function in the Token class. Either way, I highly suspect that this is the cause of the seg-fault, because it is the only overtly dangerous code in what you have posted.

As you correctly suggested, the error was hidden in the suspicious id pointer. The parser in my program receives Tokens from the lexer via unique_ptr and stores each one directly as the current token. The method current_token() therefore returned a raw pointer to a Token owned by that unique_ptr, and the Token is destroyed as soon as the next call to next_token() takes place. Storing that pointer in id and using it after the Token had already been destroyed caused the problem.
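The mechanism can be reproduced in a small sketch. Everything here is an assumption made for illustration: Token and MiniParser are simplified, hypothetical versions of the real classes, and the destroyed_flag member exists only so the destruction can be observed without dereferencing the dangling pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token that records its own destruction in an external flag.
struct Token {
    Token(std::string t, bool* destroyed)
        : text{std::move(t)}, destroyed_flag{destroyed} {}
    ~Token() { if (destroyed_flag) *destroyed_flag = true; }
    std::string text;
    bool* destroyed_flag;
};

// Hypothetical parser that owns only the current token, as described above.
class MiniParser {
public:
    explicit MiniParser(std::vector<std::unique_ptr<Token>> tokens)
        : m_pending(std::move(tokens)) { next_token(); }

    Token const* current_token() const { return m_current.get(); }

    Token const* next_token() {
        if (m_index < m_pending.size()) {
            // Move assignment destroys the previously owned Token.
            m_current = std::move(m_pending[m_index++]);
        } else {
            m_current.reset();
        }
        return m_current.get();
    }

private:
    std::vector<std::unique_ptr<Token>> m_pending; // stands in for the lexer
    std::size_t m_index = 0;
    std::unique_ptr<Token> m_current;
};

inline bool identifier_dies_on_advance() {
    bool id_destroyed = false;
    std::vector<std::unique_ptr<Token>> tokens;
    tokens.push_back(std::make_unique<Token>("x", &id_destroyed));
    tokens.push_back(std::make_unique<Token>("=", nullptr));
    MiniParser parser{std::move(tokens)};

    Token const* id = parser.current_token(); // same pattern as in the question
    bool const alive_before = !id_destroyed;
    parser.next_token();                      // the identifier Token is destroyed here
    (void)id; // id now dangles; dereferencing it would be undefined behavior
    return alive_before && id_destroyed;
}
```

The sketch deliberately never reads through id after the call to next_token(), because that read is exactly the undefined behavior that showed up as a segmentation fault.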

I fixed the code in a couple of ways.

First, I changed the return types of the helper methods above from "Token const*" to "Token const&". Second, the id variable now only holds a copy of the get_string() value and is not involved in any other pointer-related operations.
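A minimal sketch of that pattern might look like the following. Token, MiniParser and parse_identifier_name are hypothetical, simplified names rather than the real code, and the sketch assumes the token stream is never empty and that the last token stays current once the input is exhausted.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified token.
struct Token {
    explicit Token(std::string t) : text{std::move(t)} {}
    std::string const& get_string() const { return text; }
    std::string text;
};

// Hypothetical parser; the vector stands in for the lexer.
class MiniParser {
public:
    explicit MiniParser(std::vector<std::unique_ptr<Token>> tokens)
        : m_pending(std::move(tokens)) {
        assert(!m_pending.empty()); // assumption: at least one token
        next_token();
    }

    // The reference is only valid until the next call to next_token().
    Token const& current_token() const { return *m_current; }

    Token const& next_token() {
        if (m_index < m_pending.size()) {
            m_current = std::move(m_pending[m_index++]);
        }
        // Assumption: at the end of input the last token stays current.
        return *m_current;
    }

private:
    std::vector<std::unique_ptr<Token>> m_pending;
    std::size_t m_index = 0;
    std::unique_ptr<Token> m_current;
};

// Copy the identifier text BEFORE advancing; the copy outlives the Token.
inline std::string parse_identifier_name(MiniParser& parser) {
    std::string name = parser.current_token().get_string(); // owned copy
    parser.next_token();                                    // old Token is destroyed
    return name;
}

inline bool fixed_pattern_works() {
    std::vector<std::unique_ptr<Token>> tokens;
    tokens.push_back(std::make_unique<Token>("x"));
    tokens.push_back(std::make_unique<Token>("="));
    MiniParser parser{std::move(tokens)};
    std::string const name = parse_identifier_name(parser);
    return name == "x" && parser.current_token().get_string() == "=";
}
```

Returning a reference does not extend the Token's lifetime by itself; what removes the dangling access is copying the string while the Token is still the current one.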

With these changes the segmentation fault problem was successfully solved! =)
