Regex capture optional groups
I'm trying to capture two groups of numbers, where each group is optional and should be captured only if it contains numbers. Here is a list of all valid combinations that it is supposed to match:
123(456)
123
(456)
abc(456)
123(efg)
And these are not valid combinations and should not be matched:
abc(efg)
abc
(efg)
However, my regex fails on combinations #4 and #5, even though they contain numbers.
const list = ["123(456)", "123", "(456)", "abc(456)", "123(def)", "abc(def)", "abc", "(def)"];
const regex = /^(?:(\d+))?(?:\((\d+)\))?$/;
list.forEach((a, i) =>
  console.log(i + 1 + ". ", a + "=>".padStart(11 - a.length, " "), JSON.stringify((a.match(regex) || []).slice(1)))
);
So, the question is: why, when ? is used behind a group, doesn't it "skip" that group if nothing matched?
P.S. With this regex it also captures #4, but not #5: /(?:^|(\d+)?)(?:\((\d+)\))?$/
A solution to what you're looking for can be done with lookahead; see:
(?=^\d+(?:\(|$))(\d+)|(?=\d+\)$)(\d+)
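Rough translation: a number at the start of the text that is followed by an opening bracket (or the end of the line), OR a number that is followed by a closing bracket at the end of the line. Because the two numbers are found as separate matches, the pattern has to be run with the global flag.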
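To make the global-flag usage concrete, here is a minimal sketch of how both numbers could be gathered with String.prototype.matchAll. The helper name extractNumbers is an assumption made for illustration and is not part of the original answer.

```javascript
// Sketch only: `extractNumbers` is a hypothetical helper, not from the answer.
// The pattern is the lookahead regex from above with the global flag added.
const pattern = /(?=^\d+(?:\(|$))(\d+)|(?=\d+\)$)(\d+)/g;

function extractNumbers(input) {
  let first;  // number at the start of the string (group 1)
  let second; // number just before the closing bracket (group 2)
  // matchAll requires a global regex and yields one match per alternative hit.
  for (const m of input.matchAll(pattern)) {
    if (m[1] !== undefined) first = m[1];
    if (m[2] !== undefined) second = m[2];
  }
  return [first, second];
}

console.log(extractNumbers("123(456)")); // [ '123', '456' ]
console.log(extractNumbers("abc(456)")); // [ undefined, '456' ]
console.log(extractNumbers("abc(def)")); // [ undefined, undefined ]
```

An input with no valid number simply yields two undefined entries, so the caller still has to decide whether that counts as "no match".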
To answer the question on optional capture groups:
Yes, if a group is marked optional, e.g. (A*)?, it does make the whole group optional. In your case, it is simply a case of the regex not matching as a whole, even when the optional part isn't there (verify with the help of a regex debugger). The ^ and $ anchors require the pattern to cover the entire string, and nothing in the pattern can consume the non-digit text, so skipping a group is not enough to make the match succeed.
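A small sketch of that behaviour, using the regex from the question. The helper name matchGroups is hypothetical and exists only to make the result easy to print.

```javascript
// The anchored regex from the question: both groups are optional,
// but ^ and $ still force the pattern to span the whole input.
const anchored = /^(?:(\d+))?(?:\((\d+)\))?$/;

// Hypothetical helper: returns the two captures, or null when nothing matched.
function matchGroups(input) {
  const m = input.match(anchored);
  return m ? [m[1], m[2]] : null;
}

// The second group is skipped and reported as undefined.
console.log(matchGroups("123"));      // [ '123', undefined ]
// "abc" cannot be consumed by any part of the pattern, so the whole match fails.
console.log(matchGroups("abc(456)")); // null
// "(def)" is neither digits-in-brackets nor skippable before $.
console.log(matchGroups("123(def)")); // null
```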
@WiktorStribiżew and @akash had good ideas, but they are based on the global flag, which requires an additional loop to gather all the matches.
For now, I came up with this regex, which matches anything but captures only what I need.
const list = ["123(456)", "123", "(456)", "abc(456)", "123(def)", "abc(def)", "abc", "(def)"];
const regex = /(?:(\d+)|^|[^(]+)+?(?:\((?:(\d+)|\D*)\)|$)+?/;
list.forEach((a, i) =>
  console.log(i + 1 + ". ", a + "=>".padStart(11 - a.length, " "), JSON.stringify((a.match(regex) || []).slice(1)))
);
Here is an idea without the global flag that is supposed to match only the needed items:
^(?=\D*\d)(\d+)?\D*(?:\((\d*)\))?\D*$
^(?=\D*\d): the lookahead at the ^ start checks for at least one digit
(\d+)?: captures the digits into the optional first group
\D*: followed by any amount of non-digits
(?:\((\d*)\))?: digits in parentheses go into the optional second group
\D*$: matches any amount of \D non-digits up to the $ end
See your JS demo or a demo at regex101 (the [^\d\n] is only for the multiline demo).
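As a quick check, the pattern can be run against the list from the question. This is a sketch rather than the linked demo; the variable names are assumptions chosen for illustration.

```javascript
// Single-pass pattern from the answer above (no global flag needed).
const singlePass = /^(?=\D*\d)(\d+)?\D*(?:\((\d*)\))?\D*$/;

const inputs = ["123(456)", "123", "(456)", "abc(456)", "123(def)", "abc(def)", "abc", "(def)"];

// For each input, keep the two captures, or null when the input is rejected.
const results = inputs.map((s) => {
  const m = s.match(singlePass);
  return m ? [m[1], m[2]] : null;
});

results.forEach((r, i) => console.log(`${i + 1}. ${inputs[i]}`, r));
// 1. 123(456) [ '123', '456' ]
// 2. 123 [ '123', undefined ]
// 3. (456) [ undefined, '456' ]
// 4. abc(456) [ undefined, '456' ]
// 5. 123(def) [ '123', undefined ]
// 6. abc(def) null
// 7. abc null
// 8. (def) null
```

Note that the second group uses \d*, so an input such as 123() would report an empty string for it rather than undefined.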