
Parse key, value pairs when key is not unique

My input consists of multiple key, value pairs, e.g.:

A=1, B=2, C=3, ..., A=4

I want to parse the input into the following type:

 std::map< char, std::vector< int > > m

Values for equal keys shall be appended to the vector, so the parsed output should be equal to:

m['A']={1,4};
m['B']={2};
m['C']={3};
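
For reference, the grouping itself can be sketched without any parsing. The helper name append_value and the alias grouped_map below are made up for illustration and do not come from any library; the sketch only shows the append-on-repeated-key behaviour that the parsed result should have.

```cpp
#include <map>
#include <vector>

// Alias introduced only for this sketch.
using grouped_map = std::map<char, std::vector<int>>;

// Hypothetical helper: appends one value under its key.
// std::map::operator[] default-constructs an empty vector the first time
// a key is seen, so a repeated key simply extends the existing vector.
inline void append_value(grouped_map& m, char key, int value)
{
    m[key].push_back(value);
}
```

Calling append_value for ('A', 1), ('B', 2), ('C', 3) and ('A', 4) in that order leaves m['A'] holding 1 and 4, which is the result expected from the parser.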

What is the simplest solution using boost::spirit::qi?

Here is one way to do it:

#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/vector.hpp>
#include <boost/fusion/include/at_c.hpp>
#include <iostream>
#include <utility>
#include <string>
#include <vector>
#include <map>

namespace qi = boost::spirit::qi;
namespace fusion = boost::fusion;

int main()
{
    std::string str = "A=1, B=2, C=3, A=4";

    std::map< char, std::vector< int > > m;
    auto inserter = [&m](fusion::vector< char, int > const& parsed,
        qi::unused_type, qi::unused_type)
    {
        m[fusion::at_c< 0 >(parsed)].push_back(fusion::at_c< 1 >(parsed));
    };

    auto it = str.begin(), end = str.end();
    bool res = qi::phrase_parse(it, end,
        ((qi::char_ >> '=' >> qi::int_)[inserter]) % ',',
        qi::space);

    if (res && it == end)
        std::cout << "Parsing complete" << std::endl;
    else
        std::cout << "Parsing incomplete" << std::endl;

    for (auto const& elem : m)
    {
        std::cout << "m['" << elem.first << "'] = {";
        for (auto value : elem.second)
            std::cout << " " << value;
        std::cout << " }" << std::endl;
    }

    return 0;
}

A few comments about the implementation:

  1. qi::phrase_parse is a Boost.Spirit algorithm that takes a pair of iterators, a parser, and a skip parser, and runs the parsers on the input denoted by the iterators. In the process, it updates the beginning iterator (it in this example) so that it points to the end of the consumed input upon return. The returned res value indicates whether the parsers have succeeded (i.e. the consumed input could be successfully parsed). A successful parse may still stop before the end of the input, which is why the code also checks it == end. There are other forms of qi::phrase_parse that allow extracting attributes (the parsed data, in Boost.Spirit terms), but we're not using attributes here because you have a peculiar requirement for the resulting container structure.

  2. The skip parser is used to skip portions of the input between the elements of the main parser. In this case, qi::space means that any whitespace characters in the input will be ignored, so that e.g. "A = 1" and "A=1" are parsed identically. There is also the qi::parse family of algorithms, which take no skip parser and therefore require the main parser to handle all input without skips.

  3. The (qi::char_ >> '=' >> qi::int_) part of the main parser matches a single character, followed by the equals sign character, followed by a signed integer. The equals sign is expressed as a literal (i.e. it is equivalent to the qi::lit('=') parser), which means it only matches the input but does not produce any parsed data. Therefore the result of this parser is an attribute that is a sequence of two elements: a character and an integer.

  4. The % ',' part of the parser is a list parser, which parses one or more pieces of input described by the parser on the left (the parser described in item 3), separated by the pieces described by the parser on the right (i.e. comma characters in our case). As before, the comma character is a literal parser, so it doesn't produce output.

  5. The [inserter] part is a semantic action, which is a function that is called by the parser every time it matches a portion of the input string. The parser passes all its parsed output as the first argument to this function. In our case the semantic action is attached to the parser described in item 3, which means a sequence of a character and an integer is passed. Boost.Spirit uses a fusion::vector to pass this data. The other two arguments of the semantic action are not used in this example and can be ignored.

  6. The inserter function in this example is a lambda function, but it could be any other kind of function object, including a regular function, a function generated by std::bind, etc. The important part is that it has the specified signature and that the type of its first argument is compatible with the attribute of the parser to which it is attached as a semantic action. So, if we had a different parser in item 3, this argument would have to be changed accordingly.

  7. fusion::at_c< N >() in the inserter obtains the element of the vector at index N. It is very similar to std::get< N >() when applied to std::tuple.
