regexec does not return multiple matches

I tried to learn POSIX regex by running a linked example with my own regex and text:

    const char * regex_text = "[[:digit:]]{2}\\:[[:digit:]]{2}\\:[[:digit:]]{2},[[:digit:]]{3}";
    const char * find_text = "00:01:54,644 --> 00:01:56,714 --> 00:02:58,589";

The output:

    Trying to find '[[:digit:]]{2}\:[[:digit:]]{2}\:[[:digit:]]{2},[[:digit:]]{3}' in '00:01:54,644 --> 00:01:56,714 --> 00:02:58,589'
    $& is '00:01:54,644' (bytes 0:12)
    $& is '00:01:56,714' (bytes 17:29)
    $& is '00:02:58,589' (bytes 34:46)
    No more matches.

My question is: why was only one match found on each pass through the for loop, while the while loop is what actually found all three? Shouldn't a single regexec call return all matches in m?

The for loop iterates over the capture groups within a single match (the subexpressions enclosed in parentheses). So if you had written

    ([[:digit:]]{2}\\:[[:digit:]]{2}\\:[[:digit:]]{2},[[:digit:]]{3}) --> ([[:digit:]]{2}\\:[[:digit:]]{2}\\:[[:digit:]]{2},[[:digit:]]{3}) --> ([[:digit:]]{2}\\:[[:digit:]]{2}\\:[[:digit:]]{2},[[:digit:]]{3})

as your regex, your three timestamps would show up in $1, $2, and $3.
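To make the capture-group case concrete, here is a minimal sketch that runs one regexec() call and prints every slot it filled. The helper name `print_groups`, the `TS` macro, and `MAX_SLOTS` are made up for this illustration and are not from the linked example. The pattern uses a plain `:` rather than `\:`, on the assumption that the colon needs no escaping in a POSIX extended regex.

```c
#include <regex.h>
#include <stdio.h>

/* One parenthesized timestamp; plain ':' is used instead of '\:'. */
#define TS "([[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2},[[:digit:]]{3})"
#define MAX_SLOTS 4 /* slot 0 = whole match, slots 1..3 = capture groups */

/* Hypothetical helper: performs a single regexec() call and prints each
 * slot that was filled. Returns the number of filled slots, 0 if there
 * is no match, or -1 if the pattern does not compile. */
int print_groups(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m[MAX_SLOTS];
    int filled = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, text, MAX_SLOTS, m, 0) == 0) {
        for (int i = 0; i < MAX_SLOTS; i++) {
            if (m[i].rm_so == -1)
                continue; /* slot not used by this pattern */
            printf("$%d is '%.*s' (bytes %d:%d)\n", i,
                   (int)(m[i].rm_eo - m[i].rm_so), text + m[i].rm_so,
                   (int)m[i].rm_so, (int)m[i].rm_eo);
            filled++;
        }
    }

    regfree(&re);
    return filled;
}
```

Calling `print_groups(TS " --> " TS " --> " TS, find_text)` with the text from the question should fill four slots: slot 0 for the whole line and slots 1 to 3 for the individual timestamps.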

In your code, however, the regex matches only one timestamp. If you want to catch the next one, you need to execute a new match starting after the previous one, which is what the while loop does.

To specifically answer the question: it is normal that a single call to regexec() returns only the first match of the regex, hence the need for an outer loop to iterate through all matches.
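The outer loop described above could look roughly like the following. This is a sketch, not the code from the linked example: the helper name `print_all_matches` is invented here, and the pattern is assumed to be a POSIX extended regex. After each match, the search restarts just past the end of the previous one.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: prints every non-overlapping match of `pattern`
 * in `text`. Returns the number of matches, or -1 if the pattern does
 * not compile. */
int print_all_matches(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m[1];      /* only slot 0 (the whole match) is needed */
    const char *p = text; /* where the next search starts */
    int eflags = 0;
    int count = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* Each regexec() call reports at most one match: the first at or after p. */
    while (regexec(&re, p, 1, m, eflags) == 0) {
        /* Offsets are relative to p, so convert them back to offsets in text. */
        int start = (int)(p - text) + (int)m[0].rm_so;
        int end = (int)(p - text) + (int)m[0].rm_eo;
        printf("$& is '%.*s' (bytes %d:%d)\n",
               end - start, text + start, start, end);
        count++;

        p += m[0].rm_eo; /* continue after this match */
        if (m[0].rm_eo == m[0].rm_so) {
            /* Empty match: step one byte forward to avoid looping forever. */
            if (*p == '\0')
                break;
            p++;
        }
        eflags = REG_NOTBOL; /* p is no longer the real start of the line */
    }

    regfree(&re);
    return count;
}
```

With the question's text and the pattern `[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2},[[:digit:]]{3}`, this should report three matches at bytes 0:12, 17:29, and 34:46, matching the output shown in the question.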

The confusion comes from the fact that the regmatch_t array describes only one match of the regex. It is an array because it has to hold the offsets of the match itself (element 0) and the offsets of each parenthesized sub-expression within that match (elements 1 and up).
