
C++ regex substring: wrong pattern found

I'm trying to understand how regex works in C++.

std::string s ("Ni Ni Ni NI");
std::regex e ("(Ni)");

std::smatch sm;  
std::regex_search (s,sm,e);
std::cout << "string object with " << sm.size() << " matches\n"; 

Shouldn't this give me the number of substrings that match my pattern? It always gives me 1 match, and it says the match is [Ni, Ni]. I need it to find every occurrence of the pattern: there should be 3, like this: [Ni][Ni][Ni].

The function std::regex_search only returns the results for the first match found in your string.
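This also accounts for the [Ni, Ni] in the question: after a successful search, sm.size() is the number of sub-matches inside that one match (the whole match plus one entry per capturing group), not the number of occurrences in the string. A minimal sketch of that behaviour; the helper name submatch_count is made up for illustration:

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: returns what sm.size() reports after one regex_search.
std::size_t submatch_count(const std::string& text, const std::string& pattern)
{
    const std::regex re(pattern);
    std::smatch sm;
    std::regex_search(text, sm, re);
    // 0 if nothing matched; otherwise 1 (whole match) + number of capturing groups.
    return sm.size();
}
```

With the question's input, submatch_count("Ni Ni Ni NI", "(Ni)") is 2: the whole match "Ni" and group 1, which is also "Ni".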

Here is some code, merged from yours and from an example on cplusplus.com. The idea is to search for the first match, process it, and then search again in the rest of the string (that is, the substring that directly follows the match, which match_results::suffix gives you).

Note that the regex has two capturing groups, (Ni*) and ([^ ]*). The first matches an N followed by zero or more i characters; the second matches everything up to the next space.

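#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string s("the knights who say Niaaa and Niooo");
    std::smatch m;
    std::regex e("(Ni*)([^ ]*)");

    // Each pass finds the first match in what is left of s.
    while (std::regex_search(s, m, e))
    {
        // m[0] is the whole match; m[1] and m[2] are the capturing groups.
        for (const auto& x : m)
            std::cout << x.str() << " ";

        std::cout << std::endl;
        // Continue with the text that follows the match.
        s = m.suffix().str();
    }
}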

This gives the following output:

Niaaa Ni aaa
Niooo Ni ooo

As you can see, every call to regex_search gives us the following information:

  • the content of the whole match,
  • the content of every capturing group.

Since we have two capturing groups, this gives us 3 strings for every regex_search.
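The same three strings can also be read by index instead of with a range-for loop: m[0] is the whole match, m[1] the first group, and m[2] the second. A small sketch, with first_match_parts as an illustrative name rather than anything from the original code:

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects m[0], m[1], ... for the first match only.
std::vector<std::string> first_match_parts(const std::string& text, const std::regex& re)
{
    std::vector<std::string> parts;
    std::smatch m;
    if (std::regex_search(text, m, re))
    {
        // Index 0 is the whole match; indices 1..size()-1 are the capturing groups.
        for (std::size_t i = 0; i < m.size(); ++i)
            parts.push_back(m[i].str());
    }
    return parts;
}
```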

EDIT: In your case, if you want to retrieve every "Ni", all you need to do is replace

std::regex e("(Ni*)([^ ]*)");

with

std::regex e("(Ni)");

You still need to iterate over your string, though.
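One way to do that iteration without copying the suffix on every pass is std::sregex_iterator, which walks over successive non-overlapping matches. This is an alternative to the suffix loop, not part of the original answer, and find_all is an illustrative name:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the text of every non-overlapping match.
std::vector<std::string> find_all(const std::string& text, const std::regex& re)
{
    std::vector<std::string> found;
    // A default-constructed sregex_iterator is the end-of-sequence marker.
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
        found.push_back(it->str());   // it->str() is the whole match, i.e. (*it)[0]
    return found;
}
```

For the question's string "Ni Ni Ni NI" and pattern "(Ni)", find_all returns three elements. The final NI does not match because the regex is case-sensitive by default.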
