
Parse comma-separated ints/int-ranges in C++

Given a string in C++ containing ranges and single numbers of the kind:

"2,3,4,7-9"

I want to parse it into a vector of the form:

2,3,4,7,8,9

If the numbers are separated by a - then I want to push all of the numbers in the range. Otherwise I want to push a single number.

I tried using this piece of code:

const char *NumX = "2,3,4-7";
std::vector<int> inputs;
std::istringstream in( NumX );
std::copy( std::istream_iterator<int>( in ), std::istream_iterator<int>(),
           std::back_inserter( inputs ) );

The problem was that it did not work for the ranges. It only took the numbers in the string, not all of the numbers in the range.

Your problem consists of two separate problems:

  1. splitting the string into multiple strings at ,
  2. adding either numbers or ranges of numbers to a vector when parsing each string

If you first split the whole string at a comma, you won't have to worry about splitting it at a hyphen at the same time. This is what you would call a Divide-and-Conquer approach.

Splitting at ,

This question should tell you how you can split the string at a comma.
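For reference, one common way to split at a comma is std::getline with a custom delimiter. The following is only a minimal sketch: the helper name split_at_comma is made up for illustration and is not taken from the linked question.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for this sketch): splits `input` at every
// comma and returns the pieces in order. An empty piece between two commas is
// kept; a single trailing comma does not produce an extra empty piece.
std::vector<std::string> split_at_comma(const std::string& input) {
    std::vector<std::string> pieces;
    std::istringstream stream(input);
    std::string piece;
    // std::getline with ',' as delimiter reads up to (and discards) the next comma.
    while (std::getline(stream, piece, ',')) {
        pieces.push_back(piece);
    }
    return pieces;
}
```

Calling split_at_comma("2,3,4,7-9") gives the four strings "2", "3", "4" and "7-9", which can then be handled one at a time.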

Parsing and Adding to std::vector<int>

Once you have split the string at a comma, you just need to turn ranges into individual numbers by calling this function for each string:

#include <vector>
#include <string>

void push_range_or_number(const std::string &str, std::vector<int> &out) {
    std::size_t hyphen_index;
    // stoi stores the index of the first unparsed character in hyphen_index.
    int first = std::stoi(str, &hyphen_index);
    out.push_back(first);

    // If hyphen_index is equal to the length of the string,
    // there is no other number.
    // Otherwise, we parse the second number here:
    if (hyphen_index != str.size()) {
        int second = std::stoi(str.substr(hyphen_index + 1), &hyphen_index);
        for (int i = first + 1; i <= second; ++i) {
            out.push_back(i);
        }
    }
}

Note that splitting at a hyphen is much simpler because we know there can be at most one hyphen in the string. std::string::substr is the easiest way of doing it in this case. Be aware that std::stoi throws std::out_of_range if the integer is too large to fit into an int, and std::invalid_argument if the string does not start with a number.
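Putting the two steps together could look roughly like the sketch below. The driver try_parse_ranges is a hypothetical name introduced here; push_range_or_number is repeated from above only so that the example compiles on its own. The choice to report failure through a bool instead of letting the exception escape is an assumption, not part of the answer.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Repeated from the answer above so that this sketch is self-contained.
void push_range_or_number(const std::string& str, std::vector<int>& out) {
    std::size_t hyphen_index;
    int first = std::stoi(str, &hyphen_index);
    out.push_back(first);
    if (hyphen_index != str.size()) {
        int second = std::stoi(str.substr(hyphen_index + 1));
        for (int i = first + 1; i <= second; ++i) {
            out.push_back(i);
        }
    }
}

// Hypothetical driver: splits at commas and parses every piece.
// Returns false if std::stoi rejects a piece; `out` may then hold the
// numbers that were parsed before the bad piece.
bool try_parse_ranges(const std::string& input, std::vector<int>& out) {
    std::istringstream stream(input);
    std::string piece;
    try {
        while (std::getline(stream, piece, ',')) {
            push_range_or_number(piece, out);
        }
    } catch (const std::invalid_argument&) {
        return false;  // a piece did not start with a number
    } catch (const std::out_of_range&) {
        return false;  // a number did not fit into an int
    }
    return true;
}
```

With the input "2,3,4,7-9" this fills the vector with 2, 3, 4, 7, 8, 9 and returns true.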

Apart from @J. Schultke's excellent example, I suggest the use of regexes in the following way:

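#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <regex>
#include <string>
#include <vector>

// Expands a block such as "5-6," (trailing delimiter included) into 5, 6.
void process(std::string str, std::vector<int>& num_vec) {
    str.pop_back();  // drop the trailing ',' or '#'
    const std::size_t hyphen = str.find('-');
    // Parse both ends with std::stoi so that multi-digit numbers work too.
    const int last = std::stoi(str.substr(hyphen + 1));
    for (int i = std::stoi(str.substr(0, hyphen)); i <= last; i++) {
        num_vec.push_back(i);
    }
}

int main() {
    std::string str("1,2,3,5-6,7,8");
    str += "#";  // sentinel so that the last block also ends in a delimiter
    // ',' and '#' are not regex metacharacters, so they need no backslash.
    std::regex vec_of_blocks(".*?,|.*?#");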
    auto blocks_begin = std::sregex_iterator(str.begin(), str.end(), vec_of_blocks);
    auto blocks_end = std::sregex_iterator();
    std::vector<int> vec_of_numbers;
    for (std::sregex_iterator regex_it = blocks_begin; regex_it != blocks_end; regex_it++) {
        std::smatch match = *regex_it;
        std::string block = match.str();
        if (std::find(block.begin(), block.end(), '-') != block.end()) {
            process(block, vec_of_numbers);
        }
        else {
            vec_of_numbers.push_back(std::atoi(block.c_str()));
        }
    }
    return 0;
}

Of course, you still need a bit of validation; however, this will get you started.
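As one possible form of that validation, the whole input could be checked against a pattern before any parsing happens. This is only a sketch under assumptions: the helper name looks_like_range_list is invented here, only non-negative numbers are accepted, and the check does not verify that a range is ascending or that the numbers fit into an int.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `input` consists solely of comma-separated
// items, where each item is either digits ("7") or digits-hyphen-digits ("7-9").
bool looks_like_range_list(const std::string& input) {
    // item: \d+(-\d+)?   list: item(,item)*
    static const std::regex pattern(R"(\d+(-\d+)?(,\d+(-\d+)?)*)");
    // regex_match succeeds only if the pattern covers the entire string.
    return std::regex_match(input, pattern);
}
```

Inputs such as "1,,2", "1-2-3" or a trailing comma are rejected, so the parsing code only ever sees well-formed blocks.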

All very nice solutions so far. Using modern C++ and regex, you can do an all-in-one solution with only very few lines of code.

How? First, we define a regex that matches either an integer or an integer range. It looks like this:

((\d+)-(\d+))|(\d+)

Really very simple. First the range: some digits, followed by a hyphen and some more digits. Then the plain integer: some digits. All digits are put in capture groups (parentheses). The hyphen has no group of its own.
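To make the group numbering concrete, the small sketch below lists which capture groups take part in a match for a single token. The helper matched_groups is hypothetical and exists only for illustration; groups are numbered by the position of their opening parenthesis.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: returns "index=text " for every capture group that
// participated in matching `token`, or an empty string if nothing matched.
std::string matched_groups(const std::string& token) {
    static const std::regex re{ R"(((\d+)-(\d+))|(\d+))" };
    std::smatch sm;
    std::string result;
    if (std::regex_match(token, sm, re)) {
        // Group 0 is the whole match, so the capture groups start at index 1.
        for (std::size_t g = 1; g < sm.size(); ++g) {
            if (sm[g].matched) {
                result += std::to_string(g) + "=" + sm[g].str() + " ";
            }
        }
    }
    return result;
}
```

For "7-9", groups 1 (the whole range), 2 (lower bound) and 3 (upper bound) take part; for "4", only group 4 does. That difference is what allows a range to be told apart from a plain integer.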

This is all so easy that no further explanation is needed.

Then we call std::regex_search in a loop until all matches are found.

For each match, we check whether the range group matched. If it did, we add the values between the two sub-matches (inclusive) to the resulting std::vector.

If we have just a plain integer, then we add only this value.

All this gives a very simple and easy-to-understand program:

#include <iostream>
#include <string>
#include <vector>
#include <regex>

const std::string test{ "2,3,4,7-9" };

const std::regex re{ R"(((\d+)-(\d+))|(\d+))" };
std::smatch sm{};

int main() {
    // Here we will store the resulting data
    std::vector<int> data{};

    // Search all occurrences of integers OR ranges
    for (std::string s{ test }; std::regex_search(s, sm, re); s = sm.suffix()) {

        // We found something. Was it a range?
        if (sm[1].str().length())

            // Yes, range, add all values within to the vector  
            for (int i{ std::stoi(sm[2]) }; i <= std::stoi(sm[3]); ++i) data.push_back(i);
        else
            // No, no range, just a plain integer value. Add it to the vector
            data.push_back(std::stoi(sm[0]));
    }
    // Show result
    for (const int i : data) std::cout << i << '\n';
    return 0;
}

If you should have more questions, I am happy to answer.


Language: C++17. Compiled and tested with MS Visual Studio 19 Community Edition.

Consider pre-processing your number string and splitting it. In the following code, transform() converts each of the delimiters ',', '-' and '+' into a space so that std::istream_iterator can parse the ints successfully.

#include <cstdlib>
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>
#include <iostream>
#include <sstream>

int main(void)
{
    std::string nums = "2,3,4-7,9+10";
    const std::string delim_to_convert = ",-+";  // , - and +
    // Replace every delimiter character with a space.
    std::transform(nums.cbegin(), nums.cend(), nums.begin(),
            [&delim_to_convert](char ch) {return (delim_to_convert.find(ch) != std::string::npos) ? ' ' : ch; });

    std::istringstream ss(nums);
    auto inputs = std::vector<int>(std::istream_iterator<int>(ss), {});

    exit(EXIT_SUCCESS);
}

Note that the code above can split only on single-character delimiters, and that it treats '-' as a plain separator: "4-7" yields 4 and 7, not the whole range. You should refer to @d4rk4ng31's answer if you need more complex and longer delimiters.
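If ranges should be expanded while still relying on stream extraction, one possibility is to read a number, peek at the next character, and read a second number when that character is a hyphen. The sketch below is not part of the answer above; the name parse_with_stream and the decision to stop silently at the first malformed item are assumptions made for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads "a" or "a-b" items separated by commas straight
// from a stream and expands each range. Parsing stops at the first item that
// cannot be read as an int.
std::vector<int> parse_with_stream(const std::string& input) {
    std::vector<int> out;
    std::istringstream in(input);
    int first = 0;
    while (in >> first) {
        int last = first;
        if (in.peek() == '-') {
            in.get();  // consume the hyphen
            if (!(in >> last)) {
                break;  // hyphen without a second number
            }
        }
        for (int i = first; i <= last; ++i) {
            out.push_back(i);
        }
        if (in.peek() == ',') {
            in.get();  // consume the separator before the next item
        }
    }
    return out;
}
```

For example, parse_with_stream("2,3,4-7,9") yields 2, 3, 4, 5, 6, 7, 9.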

Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please credit this site's URL or the original address.