
C++: Vector element assignment causing Access violation

Breaking a string up in C++ into individual lines and parameters (as a 2D vector) can produce access violations when the vectors are passed between functions. In the code example, many attempts have been made to ensure that the data passed to and from the functions are independent objects and in no way references.

segregate.hpp

#pragma once

#include <vector>
#include <string>

/*
  Purpose:
    To take a whole file as a string,
    and break it up into individual words
*/

namespace Segregate {
  // Module types
  typedef std::vector< std::string > ParamArray;
  struct StrCommand{
    unsigned long line;
    ParamArray param;
  };
  typedef std::vector< StrCommand > StrCommands;

  bool IsParamBreak(char val);
  bool IsLineBreak(char val);

  ParamArray Parameterize(std::string str);
  StrCommands Fragment(std::string str);
}


#include "./segregate.cpp"

segregate.cpp

#include "./segregate.hpp"

namespace Segregate{
    bool IsParamBreak(char val){
        if (val == ' '){
            return true;
        }else if (val == '\t'){
            return true;
        }

        return false;
    };
    bool IsLineBreak(char val){
        if (val == '\n'){
            return true;
        }

        return false;
    };

    // Splits a single line into individual parameters
    ParamArray Parameterize(std::string str){
        str.append(" "); // Ensures that the loop will cover all segments
        unsigned long length = str.size();
        unsigned long comStart = 0;
        ParamArray res;


        // Ignore carriage returns
        //  Windows artifact
        if (str[0] == '\r'){
            comStart = 1;
        }


        // Ignore indentation
        //  Find the start of actual content
        while (comStart < length && IsParamBreak(str[comStart])){
            comStart++;
        }

        // Count the number of parameters
        unsigned long vecLen = 0;
        for (unsigned long i=comStart; i<length; i++){
            if ( IsParamBreak(str[i]) ){
                vecLen++;
            }
        }
        res.reserve(vecLen);


        // Scan will fail if there is no data
        if (length == 0){
            return res;
        }


        // Slice the string into parts
        unsigned long toIndex = 0;
        unsigned long cursor = comStart;
        for (unsigned long i=cursor; i<length; i++){
            if (IsParamBreak(str[i]) == true){
                // Transfer the sub-string to the vector,
                //  Ensure that the data is its own, and not a reference
                res[toIndex].reserve(i-cursor);

                // Error here
                res[toIndex].assign( str.substr(cursor, i-cursor) );

                cursor = i+1;
                toIndex++;
            }
        }

        return res;
    };


    StrCommands Fragment(std::string str){
        str.append("\n"); // Ensures that the loop will cover all segments
        unsigned long length = str.size();

        // Result
        StrCommands res;


        // Count lines
        //  Ignoring empty lines
        unsigned long vecLen = 1;
        for (unsigned long i=0; i<length; i++){
            if (IsLineBreak(str[i])){
                vecLen++;
            }
        }
        res.reserve(vecLen);


        // Ignore 'empty' strings as they may cause errors
        if (vecLen == 0){
            return res;
        }


        // Read lines
        unsigned long toIndex = 0;
        unsigned long cursor = 0;
        for (unsigned long i=0; i<length; i++){
            if (IsLineBreak(str[i])){

                // Error here
                res[toIndex].param = ParamArray(  Parameterize( std::string(str.substr(cursor, i-cursor)) )  );
                res[toIndex].line = i+1;

                // Ignore blank lines
                if (res[toIndex].param.size() == 0){
                    vecLen--;
                }else{
                    toIndex++;
                }
                cursor = i+1;
            }
        }


        // Shrink the result due to undersizing for blank lines
        res.reserve(vecLen);

        return res;
    };
}

Memory access violations normally occur on lines 66 and 108, at the assignments marked "Error here" (when the element data is stored locally within a vector). The fault appears to occur during the assignment phase, as deduced by storing the result in an intermediate temporary variable directly after it is parsed. The error can also occur during vector::reserve(), though that happens less often.


Note: On Windows there is no direct error message. The following:

Exception thrown at 0x00A20462 in fiber.exe: 0xC0000005: Access violation reading location 0xBAADF009.

is only seen when debugging with the 'C/C++ Extension for Visual Studio Code', not during normal terminal execution.
However, on Ubuntu it outputs:

Segmentation fault (core dumped)

You are calling reserve on your vector, which allocates memory to store your objects but doesn't construct them. Because reserve leaves size() at zero, res[toIndex] indexes past the end of the vector, which is undefined behavior. When you then try to use the methods of objects that haven't been constructed, it is likely to crash.
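The difference between reserve and resize can be checked directly. The sketch below is only an illustration: `Probe` and `ProbeVector` are hypothetical names that do not appear in the question's code. It prepares a vector either way, then uses the bounds-checked at() in place of operator[] so that an out-of-range access raises std::out_of_range instead of silently corrupting memory.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// What can be observed about a vector after preparing it.
struct Probe {
    std::size_t size;   // number of constructed elements
    bool capacityOk;    // capacity() is at least the requested count
    bool atThrows;      // whether at(0) threw std::out_of_range
};

// Hypothetical helper (not part of the original code): prepares a vector
// with either resize() or reserve() and reports what is safely accessible.
Probe ProbeVector(bool useResize, std::size_t n) {
    std::vector<std::string> v;
    if (useResize) {
        v.resize(n);   // constructs n empty strings, so size() == n
    } else {
        v.reserve(n);  // allocates storage only, size() stays 0
    }

    Probe p{v.size(), v.capacity() >= n, false};
    try {
        v.at(0).assign("x");  // bounds-checked, unlike operator[]
    } catch (const std::out_of_range&) {
        p.atThrows = true;    // element 0 does not exist
    }
    return p;
}
```

With reserve the storage is there but the vector still holds no elements, so at(0) throws. With resize the same access succeeds. Temporarily swapping operator[] for at() in the question's loops is one way to turn the intermittent crash into a reproducible exception while debugging.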

There are 2 possible solutions: either call resize instead of reserve, or call push_back to construct new objects at the end of the vector.
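As a rough sketch of the push_back route, the two functions could be rewritten along the following lines. This is not the original author's code. The namespace `SegregateFixed` is a made-up name to avoid clashing with `Segregate`. Skipping empty tokens and treating '\r' as a separator are assumptions of this sketch, not behavior of the original. The `line` field keeps the same value the original stores (`i+1`).

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace SegregateFixed {  // hypothetical name, not from the original
    typedef std::vector<std::string> ParamArray;
    struct StrCommand {
        unsigned long line;
        ParamArray param;
    };
    typedef std::vector<StrCommand> StrCommands;

    bool IsParamBreak(char val) { return val == ' ' || val == '\t'; }
    bool IsLineBreak(char val) { return val == '\n'; }

    // Splits a single line into parameters. Each parameter is appended with
    // push_back, so every element is constructed before it is used.
    ParamArray Parameterize(std::string str) {
        str.append(" ");  // sentinel so the last parameter is flushed
        ParamArray res;
        std::size_t cursor = 0;
        for (std::size_t i = 0; i < str.size(); i++) {
            // Assumption: '\r' is treated like any other separator
            if (IsParamBreak(str[i]) || str[i] == '\r') {
                if (i > cursor) {  // assumption: skip empty tokens
                    res.push_back(str.substr(cursor, i - cursor));
                }
                cursor = i + 1;
            }
        }
        return res;
    }

    // Splits a whole text into commands, one per non-blank line.
    StrCommands Fragment(std::string str) {
        str.append("\n");  // sentinel so the last line is flushed
        StrCommands res;
        std::size_t cursor = 0;
        for (std::size_t i = 0; i < str.size(); i++) {
            if (IsLineBreak(str[i])) {
                StrCommand cmd;
                cmd.param = Parameterize(str.substr(cursor, i - cursor));
                cmd.line = static_cast<unsigned long>(i + 1);  // same value as the original
                if (!cmd.param.empty()) {  // ignore blank lines
                    res.push_back(cmd);
                }
                cursor = i + 1;
            }
        }
        return res;
    }
}
```

Because the vectors grow as elements are appended, the counting passes and the final "shrink" step are no longer needed. A reserve call could still be added as a pure optimization, provided elements are only ever created through push_back or resize.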
