Regex capture optional groups

I'm trying to capture 2 groups of numbers, where each group is optional and should only be captured if it contains numbers. Here is a list of all valid combinations that it is supposed to match:

  1. 123(456)
  2. 123
  3. (456)
  4. abc(456)
  5. 123(efg)

These combinations are not valid and should not be matched:

  1. abc(efg)
  2. abc
  3. (efg)

However, my regex fails on combinations #4 and #5 even though they contain numbers.

 const list = ["123(456)", "123", "(456)", "abc(456)", "123(def)", "abc(def)", "abc", "(def)"];
 const regex = /^(?:(\d+))?(?:\((\d+)\))?$/;
 // Print each input next to its captured groups (an empty array means no match).
 list.forEach((a, i) =>
   console.log(i + 1 + ". ", a + "=>".padStart(11 - a.length, " "), JSON.stringify((a.match(regex) || []).slice(1)))
 );

So the question is: why, when ? is placed after a group, doesn't the regex "skip" that group if nothing matched?

PS: With this regex it also captures #4, but not #5: /(?:^|(\d+)?)(?:\((\d+)\))?$/

What you're looking for can be done with lookaheads:

(?=^\d+(?:\(|$))(\d+)|(?=\d+\)$)(\d+)

Rough translation: a number at the start that is followed by an opening bracket (or the end of the line), OR a number followed by a closing bracket at the end of the text.
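Because the two numbers sit in separate alternatives, a single match returns only one of them, so collecting both needs the global flag and a loop. The following is a minimal sketch of that, assuming a runtime with String.prototype.matchAll; the g flag and the helper name collectGroups are additions for illustration, not part of the answer itself.

```javascript
// Pattern from the answer, with the g flag added so matchAll can
// visit both alternatives within one input string.
const lookaheadPattern = /(?=^\d+(?:\(|$))(\d+)|(?=\d+\)$)(\d+)/g;

// collectGroups is a hypothetical helper: it merges the captures of
// all matches into one [leading, bracketed] pair (null when absent).
function collectGroups(input) {
  let leading = null;
  let bracketed = null;
  for (const m of input.matchAll(lookaheadPattern)) {
    if (m[1] !== undefined) leading = m[1];
    if (m[2] !== undefined) bracketed = m[2];
  }
  return [leading, bracketed];
}

const inputs = ["123(456)", "123", "(456)", "abc(456)", "123(def)", "abc(def)", "abc", "(def)"];
inputs.forEach((s) => console.log(s, "=>", JSON.stringify(collectGroups(s))));
```

Inputs without any usable number, such as abc(def), come back as [null, null], so the caller has to treat that pair as "no match".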

To answer the question about optional capture groups

Yes, if a group is marked optional, e.g. (A*)?, it does make the whole group optional. In your case the regex as a whole simply does not match, even when the optional parts are skipped: the pattern is anchored with ^ and $, and nothing in it can consume text such as abc or (def). You can verify this with a regex debugger.
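A small sketch may make this visible without a debugger. It runs the question's regex and a copy with the ^ and $ anchors removed against input #4; the unanchored variant is only for illustration and is not a proposed fix.

```javascript
// The regex from the question: both groups optional, anchored at both ends.
const anchored = /^(?:(\d+))?(?:\((\d+)\))?$/;
// Same groups with the anchors removed (illustration only).
const unanchored = /(?:(\d+))?(?:\((\d+)\))?/;

const input = "abc(456)";
const anchoredResult = anchored.exec(input);
const unanchoredResult = unanchored.exec(input);

// Both groups are skipped at position 0, but then $ cannot match
// because "abc(456)" is still unconsumed, so the whole match fails.
console.log(anchoredResult); // null

// Without anchors the groups are skipped just the same, and the regex
// succeeds with an empty match at index 0 and no captures.
console.log(JSON.stringify(unanchoredResult[0]), unanchoredResult.index); // "" 0
console.log(unanchoredResult[1], unanchoredResult[2]); // undefined undefined
```

So the ? does skip the groups; what fails is everything the rest of the pattern still has to account for.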

@WiktorStribiżew and @akash had good ideas, but they rely on the global flag, which requires an additional loop to gather all the matches.

For now, I have come up with this regex, which matches anything but captures only what I need.

 const list = ["123(456)", "123", "(456)", "abc(456)", "123(def)", "abc(def)", "abc", "(def)"];
 const regex = /(?:(\d+)|^|[^(]+)+?(?:\((?:(\d+)|\D*)\)|$)+?/;
 // Print each input next to its captured groups.
 list.forEach((a, i) =>
   console.log(i + 1 + ". ", a + "=>".padStart(11 - a.length, " "), JSON.stringify((a.match(regex) || []).slice(1)))
 );

Here is an idea that works without the global flag and is supposed to match only the needed items:

^(?=\D*\d)(\d+)?\D*(?:\((\d*)\))?\D*$
  • ^(?=\D*\d) The lookahead at the ^ start checks for at least one digit
  • (\d+)? captures the digits into the optional first group
  • \D* followed by any number of non-digits
  • (?:\((\d*)\))? captures digits in parentheses into the optional second group
  • \D*$ matches any number of \D non-digits up to the $ end

See your JS demo or a demo at regex101 (the [^\d\n] is only for the multiline demo).
