
C++ string parser issues

Ok, so I'm working on a homework project in C++ and have run into an issue I can't find a way around. The function is supposed to break an input string at user-defined delimiters and store the substrings in a vector to be accessed later. I think I have the basic parser figured out, but it doesn't split off the last part of the input.

#include <string>
#include <vector>
using namespace std;

int main() {
    string input =  "comma-delim-delim&delim-delim";
    vector<string> result;
    vector<char> delims;
    delims.push_back('-');
    delims.push_back('&');
    size_t begin = 0;

    for (size_t i = begin; i < input.length(); i++ ){
       for(size_t j = 0; j < delims.size(); j++){
          if(input.at(i) == delims.at(j)){
           //Compares chars in delim vector to current char in string, and
           //creates a substring from begin up to (but not including) the
           //current position, since the current char is a delimiter.
              string subString = input.substr(begin, (i - begin));
              result.push_back(subString);
              begin = i + 1;
           }
       }
    }
}

The above code works fine for splitting the input string up until the last dash. Anything after that never gets saved as a substring and pushed into the result vector, because the loop doesn't run into another delimiter. So in an attempt to rectify the matter, I put together the following:

else if(input.at(i) == input.at(input.length())){
   string subString = input.substr(begin, (input.length() - begin));
   result.push_back(subString);
}

However, I keep getting out of bounds errors with the above portion. It seems to be having an issue with the boundaries for splitting the substring, and I can't figure out how to get around it. Any help?

In your code you have to remember that .size() (or .length()) is one more than the last valid index, because indexing starts at 0: a string of size 1 has its only element at index [0]. So input.at(input.length()) is always one past the end, and at() throws std::out_of_range; input.at(input.length()-1) is the last element. Here is an approach that is working for me: after your loops, just grab the last piece of the string.

if(begin != input.length()){
    string subString = input.substr(begin,(input.length()-begin));
    result.push_back(subString);
}
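
To show where that check sits relative to the loops, here is a minimal sketch that wraps the question's parser in a function and collects the tail afterwards. The function name split and its signature are my own, not from the question, and the break is an addition that stops checking further delimiters once the current character has matched.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (name and signature are assumptions): splits `input`
// at any character found in `delims`, then appends the text that follows
// the last delimiter.
std::vector<std::string> split(const std::string& input,
                               const std::vector<char>& delims) {
    std::vector<std::string> result;
    std::size_t begin = 0;  // index where the current piece starts

    for (std::size_t i = 0; i < input.length(); ++i) {
        for (char d : delims) {
            if (input[i] == d) {
                // The piece runs from begin up to, but not including, i.
                result.push_back(input.substr(begin, i - begin));
                begin = i + 1;
                break;  // this character matched; skip the other delimiters
            }
        }
    }

    // The loops only push a piece when they meet a delimiter, so the text
    // after the last delimiter has to be collected here.
    if (begin != input.length()) {
        result.push_back(input.substr(begin));
    }
    return result;
}
```

Calling split("comma-delim-delim&delim-delim", {'-', '&'}) should give five pieces: comma followed by four delim entries. Note that with this check an input ending in a delimiter produces no trailing empty piece.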

Working from the code in the question, I've substituted iterators so that we can check for the end() of the input:

#include <string>
#include <vector>
using namespace std;

int main() {
    string input = "comma-delim-delim&delim-delim";
    vector<string> result;
    vector<char> delims;
    delims.push_back('-');
    delims.push_back('&');
    auto begin = input.begin(); // iterator to the start of the current piece

    // ii runs up to and including end() so the last piece gets flushed;
    // the goto leaves the loops before ii can be incremented past end().
    for(auto ii = input.begin(); ii <= input.end(); ii++){
        for(auto j : delims) {
            if(ii == input.end() || *ii == j){
                string subString(begin,ii); // can construct string from iterators, also when ii is at end
                result.push_back(subString);
                if(ii != input.end())
                    begin = ii + 1;
                else
                    goto done;
            }
        }
    }
done:
    return 0;
}
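
If the goto is unwelcome, the same iterator idea can be sketched with the final piece flushed after the loop instead. This is only one possible arrangement: the function name splitWithIterators is hypothetical, and using std::find for the delimiter test is a substitution of mine for the inner loop above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (name and signature are assumptions): iterator-based
// split that flushes the last piece after the loop rather than via goto.
std::vector<std::string> splitWithIterators(const std::string& input,
                                            const std::vector<char>& delims) {
    std::vector<std::string> result;
    auto begin = input.begin();  // start of the current piece

    for (auto it = input.begin(); it != input.end(); ++it) {
        // std::find returns delims.end() when *it is not a delimiter.
        if (std::find(delims.begin(), delims.end(), *it) != delims.end()) {
            result.emplace_back(begin, it);  // build a string from [begin, it)
            begin = it + 1;
        }
    }

    // Whatever follows the last delimiter; this is an empty string when the
    // input ends with a delimiter or is itself empty.
    result.emplace_back(begin, input.end());
    return result;
}
```

Because the loop stops at end() instead of stepping onto it, the iterator is never incremented past the end, even when delims is empty.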

This program uses std::find_first_of to parse the multiple delimiters:

#include <algorithm>
#include <string>
#include <vector>
using namespace std;

int main() {
    string input = "comma-delim-delim&delim-delim";
    vector<string> result;
    vector<char> delims;
    delims.push_back('-');
    delims.push_back('&');
    auto begin = input.begin(); // iterator to the start of the current piece

    for(;;) {
        // next points at the first delimiter at or after begin, or at end() if there is none
        auto next = find_first_of(begin, input.end(), delims.begin(), delims.end());
        string subString(begin, next); // can construct string from iterators
        result.push_back(subString);
        if(next == input.end())
            break;
        begin = next + 1;
    }
}
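
A closely related variant, sketched here as an alternative rather than as the answer's own method, uses the std::string member function find_first_of, which works with positions instead of iterators and takes the delimiters as a string. The function name splitByPosition is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (name and signature are assumptions): position-based
// split using std::string::find_first_of with the delimiters in a string.
std::vector<std::string> splitByPosition(const std::string& input,
                                         const std::string& delims) {
    std::vector<std::string> result;
    std::size_t begin = 0;  // index where the current piece starts

    for (;;) {
        // Index of the next delimiter at or after begin, or npos if none.
        std::size_t next = input.find_first_of(delims, begin);
        if (next == std::string::npos) {
            result.push_back(input.substr(begin));  // last piece
            break;
        }
        result.push_back(input.substr(begin, next - begin));
        begin = next + 1;
    }
    return result;
}
```

With the question's input this would be called as splitByPosition(input, "-&"). Like the iterator version above, it keeps an empty final piece when the input ends with a delimiter.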


 