Tokenize a stream in C++ using regex

I would like to tokenize a stream in C++ using a regular expression, similar to the way this is done for a string:

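#include <regex>
#include <string>
#include <vector>

// Split source on every match of re; -1 selects the text between matches.
std::vector<std::string> tokenize(const std::string& source, const std::regex& re)
{
    auto tokens = std::vector<std::string>(
        std::sregex_token_iterator{ begin(source), end(source), re, -1 },
        std::sregex_token_iterator{}
    );
    return tokens;
}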

The difference is that source would be a std::istream instead of a std::string.

I could first copy the contents of the stream into a string and then tokenize that string, but this seems inefficient.
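
One possible way to avoid buffering the whole stream, offered as a sketch under stated assumptions rather than a definitive answer: std::sregex_token_iterator needs bidirectional iterators, so it cannot run directly over the single-pass iterators an istream provides. The hypothetical helper tokenize_stream below (a name not taken from the question) reads one line at a time with std::getline and tokenizes each line separately. It assumes that no token spans a line break, so a newline always ends a token, and, unlike a plain use of the -1 submatch, it drops empty tokens.

```cpp
#include <istream>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper, not part of the original question:
// tokenizes a stream line by line instead of copying it into one big string.
// Assumption: a newline always ends a token.
std::vector<std::string> tokenize_stream(std::istream& source, const std::regex& re)
{
    std::vector<std::string> tokens;
    std::string line;
    while (std::getline(source, line)) {
        // -1 selects the text between matches of re within this line.
        std::sregex_token_iterator it{ line.cbegin(), line.cend(), re, -1 };
        const std::sregex_token_iterator last{};
        for (; it != last; ++it) {
            // Skip empty tokens (blank lines, adjacent or leading delimiters).
            if (it->length() > 0) {
                tokens.push_back(it->str());
            }
        }
    }
    return tokens;
}
```

With this sketch, only one line is held in memory at a time in addition to the resulting tokens. If a delimiter pattern can match across line breaks, this approach does not apply as written.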

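// Assumes C++17 or later, `using namespace std;`, and the headers <algorithm>, <iostream>,
// <iterator>, <numeric>, <sstream>, <string>, <type_traits>, and <vector>.
// is_word_seperators is a caller-supplied predicate on char (not shown here).
transform_reduce(istream_iterator<string>(cin), istream_iterator<string>(),
               vector<string>{},
               // Reduce: append the words read from a stringstream to the accumulated vector<string>.
               // The type check handles either argument order.
               [](auto&& a, auto&& b) {
                 auto acc = [](auto&& h, auto&& w) { h.emplace_back(w); return move(h); };
                 if constexpr (is_same_v<decay_t<decltype(a)>, vector<string>>)
                   return accumulate(istream_iterator<string>(b), istream_iterator<string>(), move(a), acc);
                 else
                   return accumulate(istream_iterator<string>(a), istream_iterator<string>(), move(b), acc);
               },
               // Transform: replace separator characters with spaces, then wrap the chunk in a stringstream.
               [&](auto l) { replace_if(l.begin(), l.end(), is_word_seperators, ' '); return stringstream{move(l)}; }
             );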
