Correct Regular Expression match containing optional substrings
I have the following set of strings:
some_param[name]
some_param_0[name]
I wish to capture some_param, 0, name from them. My regex knowledge is pretty weak. I tried the following, but it doesn't work for both cases.
/^(\D+)_?(\d{0,2})\[?(.*?)\]?$/.exec("some_param_0[name]") //works except for the trailing underscore on "some_param"
What would be the correct regex?
/^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/
(\w+?) uses a non-greedy quantifier to capture the identifier part without any trailing _.
_? is greedy, so it will beat the +? in the previous part.
(\d{0,2}) will capture 0-2 digits. It is greedy, so even if there is no _ between the identifier and the digits, this will capture digits.
(?:...)? makes the square-bracketed section optional.
\[([^\[\]]*)\] captures the contents of a square-bracketed section that does not itself contain square brackets.
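The pieces described above can be checked together against both inputs from the question. This is only a minimal sketch: the helper name parseParam and the field names name, index and key are made up for illustration and are not part of the answer.

```javascript
// Regex from the answer above; parseParam is a hypothetical helper for illustration.
const PARAM_RE = /^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/;

function parseParam(input) {
  const m = PARAM_RE.exec(input);
  if (m === null) return null;
  // m[2] is "" when there are no digits; m[3] is undefined when there is no [...] part.
  return { name: m[1], index: m[2], key: m[3] };
}

console.log(parseParam("some_param[name]"));   // name "some_param", index "", key "name"
console.log(parseParam("some_param_0[name]")); // name "some_param", index "0", key "name"
```

Note that the digit group comes back as an empty string rather than undefined when no digits are present, because (\d{0,2}) always participates in the match.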
'some_param_0[name]'.match(/^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/)
produces an array like:
["some_param_0[name]", // The matched content in group 0.
"some_param", // The portion before the digits in group 1.
"0", // The digits in group 2.
"name"] // The contents of the [...] in group 3.
Note that the non-greedy quantifier might interact strangely with the bounded repetition in \d{0,2}.
'x1234[y]'.match(/^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/)
yields
["x1234[y]","x12","34","y"]
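If that split is unwanted, one possible workaround is to capture digits only when they follow a literal underscore. The variant pattern below is an assumption for illustration, not something the answer proposes, and groupsOf is a hypothetical helper.

```javascript
// Pattern from the answer above: "_" and the digits are independently optional.
const original = /^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/;
// Assumed variant (not from the answer): digits are only captured after a literal "_".
const variant = /^(\w+?)(?:_(\d{1,2}))?(?:\[([^\[\]]*)\])?$/;

// Return just the capture groups, or null when the pattern does not match.
function groupsOf(re, input) {
  const m = re.exec(input);
  return m === null ? null : m.slice(1);
}

console.log(groupsOf(original, "x1234[y]"));          // [ 'x12', '34', 'y' ]
console.log(groupsOf(variant, "x1234[y]"));           // [ 'x1234', undefined, 'y' ]
console.log(groupsOf(variant, "some_param_0[name]")); // [ 'some_param', '0', 'name' ]
```

The trade-off is that the variant no longer picks up digits that directly follow the identifier without an underscore; they stay part of the first group.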
Got it! (taking from Mike's answer):
/^(\D+)(?:_(\d+))?(?:\[([^\]]*)\])/
'some_param[name]' => ('some_param', None, 'name')
'some_param_0[name]' => ('some_param', '0', 'name')
(at least, in Python it works)
UPDATE: A little extra I wrote while fiddling with it, making the result cleaner by using named groups:
^(?P<param>\D+)(?:_(?P<id>\d+))?(?:\[(?P<key>[^\]]*)\])
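The (?P<name>...) syntax above is Python's. Since the question uses JavaScript, here is a rough equivalent using the (?<name>...) form that JavaScript supports since ES2018. The group names are taken from the pattern above; the function name parseNamed is hypothetical.

```javascript
// Same pattern as above, rewritten with JavaScript named-group syntax.
const NAMED_RE = /^(?<param>\D+)(?:_(?<id>\d+))?(?:\[(?<key>[^\]]*)\])/;

// Hypothetical helper: returns the named groups, or null if there is no match.
function parseNamed(input) {
  const m = NAMED_RE.exec(input);
  return m === null ? null : m.groups;
}

for (const input of ["some_param[name]", "some_param_0[name]"]) {
  const { param, id, key } = parseNamed(input);
  // id is undefined when the optional "_<digits>" part is absent.
  console.log(param, id, key);
}
```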
UPDATE: The final regex I came up with is /^([A-Za-z_]+)(?:_(\d+))?(?:\[([^\]]*)\])?$/, which seems to work in a lot of cases.
Please check the following regex: (\w+)_(\d)\[(\w+)\]. It can be tested at http://rubular.com/
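As a rough comparison, the last two patterns can be run against the question's inputs. The sketch below assumes the simpler pattern is meant with its literal square brackets escaped; the names finalRe, simpleRe and captures are made up for illustration.

```javascript
// Anchored pattern from the UPDATE above.
const finalRe = /^([A-Za-z_]+)(?:_(\d+))?(?:\[([^\]]*)\])?$/;
// Simpler pattern, written with escaped brackets (an assumption about the intent).
const simpleRe = /(\w+)_(\d)\[(\w+)\]/;

// Return just the capture groups, or null when the pattern does not match.
function captures(re, input) {
  const m = re.exec(input);
  return m === null ? null : m.slice(1);
}

console.log(captures(finalRe, "some_param[name]"));    // [ 'some_param', undefined, 'name' ]
console.log(captures(finalRe, "some_param_0[name]"));  // [ 'some_param', '0', 'name' ]
console.log(captures(simpleRe, "some_param[name]"));   // null
console.log(captures(simpleRe, "some_param_0[name]")); // [ 'some_param', '0', 'name' ]
```

The simpler pattern requires the underscore and a digit, so it only covers the second input.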