
Correct Regular Expression match containing optional substrings

I have the following set of strings:

some_param[name] 
some_param_0[name]

I wish to capture some_param, 0, and name from them. My regex knowledge is pretty weak. I tried the following, but it doesn't work for both cases.

/^(\D+)_?(\d{0,2})\[?(.*?)\]?$/.exec("some_param_0[name]") //works except for the trailing underscore on "some_param"

What would be the correct regex?

/^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/

(\w+?) uses a non-greedy quantifier to capture the identifier part without any trailing _.

_? is greedy, so it takes the underscore when one is present, and the lazy +? in the previous part never needs to extend over it.

(\d{0,2}) captures zero to two digits. It is greedy, so it captures the digits even if there is no _ between the identifier and the digits.

(?:...)? makes the square-bracketed section optional.

\[([^\[\]]*)\] captures the contents of a square-bracketed section that does not itself contain square brackets.

'some_param_0[name]'.match(/^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/)

produces an array like:

["some_param_0[name]",  // The matched content in group 0.
 "some_param",          // The portion before the digits in group 1.
 "0",                   // The digits in group 2.
 "name"]                // The contents of the [...] in group 3.
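As a quick check of both inputs from the question, here is a minimal sketch that wraps the pattern above in a small helper. The name parseParam and the shape of the returned object are made up for this illustration. Note that when there are no digits, group 2 comes back as an empty string rather than undefined, because \d{0,2} still participates in the match.

```javascript
// Pattern from the answer: lazy identifier, optional "_", 0-2 digits, optional [key].
const pattern = /^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/;

// parseParam is a hypothetical helper name, used only for this illustration.
function parseParam(input) {
  const m = pattern.exec(input);
  if (m === null) return null; // the input does not have the expected shape
  const [, name, digits, key] = m; // skip group 0 (the whole match)
  return { name, digits, key };
}

console.log(parseParam("some_param_0[name]"));
console.log(parseParam("some_param[name]"));
```

If you would rather get undefined for a missing number, one option is to map the empty string to undefined after matching.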

Note that the non-greedy quantifier can interact strangely with the bounded repetition in \d{0,2}. When more than two digits precede the bracket, \w+? absorbs the leading ones, because \w also matches digits. For example:

'x1234[y]'.match(/^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/)

yields

["x1234[y]","x12","34","y"]
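To make that interaction concrete, the sketch below compares the bounded pattern with a variant that uses \d* instead. The variant is only an assumption about what you might want (it accepts any number of digits), not part of the original answer.

```javascript
// Original pattern: at most two digits in group 2.
const bounded = /^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/;
// Hypothetical variant: \d* lets group 2 take the whole run of digits.
const unbounded = /^(\w+?)_?(\d*)(?:\[([^\[\]]*)\])?$/;

const input = "x1234[y]";
// slice(1) drops group 0 (the whole match) and keeps the captures.
const boundedGroups = bounded.exec(input).slice(1);
const unboundedGroups = unbounded.exec(input).slice(1);

console.log(boundedGroups);   // [ 'x12', '34', 'y' ]
console.log(unboundedGroups); // [ 'x', '1234', 'y' ]
```

With \d*, the lazy \w+? can stop after the first character because the greedy digit group is able to consume everything up to the bracket.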

Got it! (taking from Mike's answer):

/^(\D+)(?:_(\d+))?(?:\[([^\]]*)\])/

'some_param[name]' => ('some_param', None, 'name')
'some_param_0[name]' => ('some_param', '0', 'name')

(at least, in Python it works)

UPDATE: A little extra I wrote while fiddling with it, making the result cleaner by using named groups:

^(?P<param>\D+)(?:_(?P<id>\d+))?(?:\[(?P<key>[^\]]*)\])
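Since the question is about JavaScript, here is a rough sketch of the same named-group pattern in JavaScript syntax, which spells the groups (?<name>...) rather than Python's (?P<name>...). It assumes an engine with ES2018 named capture groups; the variable names are only for illustration.

```javascript
// JavaScript spelling of the named-group pattern above.
const named = /^(?<param>\D+)(?:_(?<id>\d+))?(?:\[(?<key>[^\]]*)\])/;

// .groups holds the named captures; a group that did not participate is undefined.
const withId = named.exec("some_param_0[name]").groups;
const withoutId = named.exec("some_param[name]").groups;

console.log(withId.param, withId.id, withId.key);          // some_param 0 name
console.log(withoutId.param, withoutId.id, withoutId.key); // some_param undefined name
```

Where Python reports None for the missing id, JavaScript reports undefined.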

UPDATE:

Please check the following regular expression: "(\w+)_(\d)\[(\w+)\]". It can be tested at http://rubular.com/


 