Regular expression capture with optional trailing underscore and number
I'm trying to find a regular expression that will match the base string without the optional trailing number (_123). For example:
lorem_ipsum_test1_123 -> capture lorem_ipsum_test1
lorem_ipsum_test2 -> capture lorem_ipsum_test2
I tried using the following expression, but it only works when there is a trailing _number:
/(.+)(?>_[0-9]+)/
Similarly, adding the ? (zero or one) quantifier only works when there is no trailing _number; otherwise the trailing _number just becomes part of the first capture:
/(.+)(?>_[0-9]+)?/
Any suggestions?
You may use the following expression:
^(?:[^_]+_)+(?!\d+$)[^_]+
^ - Anchor beginning of string.
(?:[^_]+_)+ - Repeated non-capturing group: a negated character set for anything other than a _, followed by a _.
(?!\d+$) - Negative lookahead for digits at the end of the string.
[^_]+ - Negated character set for anything other than a _.
Regex demo here.
Please note that the \n in the character sets in the Regex demo is only there for demonstration purposes, and should be removed when using the pattern in JavaScript.
JavaScript demo:
// Note the single backslash in \d: inside a regex literal, \\d would match a literal backslash.
var myString = "lorem_ipsum_test1_123";
var myRegexp = /^(?:[^_]+_)+(?!\d+$)[^_]+/;
var match = myRegexp.exec(myString);
console.log(match[0]);

// Second sample: no trailing _number, so the whole string is matched.
myString = "lorem_ipsum_test2";
match = myRegexp.exec(myString);
console.log(match[0]);
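One caveat worth checking before relying on this pattern: because the group (?:[^_]+_)+ must match at least once, an input with no underscore, or one whose only underscore introduces the trailing number, produces no match at all. A minimal sketch, assuming a hypothetical helper named matchBase that is not part of the original answer:

```javascript
// Pattern from the answer above, without the g flag so exec() keeps no state.
const basePattern = /^(?:[^_]+_)+(?!\d+$)[^_]+/;

// Hypothetical helper: returns the matched base, or null when nothing matches.
function matchBase(input) {
  const m = basePattern.exec(input);
  return m === null ? null : m[0];
}

const samples = ["lorem_ipsum_test1_123", "lorem_ipsum_test2", "lorem", "lorem_123"];
for (const s of samples) {
  // The last two samples print null: the pattern needs at least one "segment_" prefix.
  console.log(s + " ==> " + matchBase(s));
}
```

If such inputs can occur, the caller needs to handle the null case, for example by falling back to the original string.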
You might match any character and use a negative lookahead that asserts that what follows is not an underscore, one or more digits and the end of the string:
^(?:(?!_\d+$).)*
Explanation
^ - Assert the start of the string
(?: - Non-capturing group
(?! - Negative lookahead to assert that what is on the right side is not
_\d+$ - an underscore, one or more digits and the end of the string
). - Close the negative lookahead and match any character
)* - Close the non-capturing group and repeat zero or more times

const strings = [
  "lorem_ipsum_test1_123",
  "lorem_ipsum_test2"
];
// Single backslash in \d: \\d in a regex literal would match a literal backslash.
let pattern = /^(?:(?!_\d+$).)*/;

strings.forEach((s) => {
  console.log(s + " ==> " + s.match(pattern)[0]);
});
You are asking for
/^(.*?)(?:_\d+)?$/
See the regex demo. The point here is that the first dot pattern must be non-greedy, the _\d+ should be wrapped in an optional non-capturing group, and the whole pattern (especially the end) must be enclosed in anchors.
Details
^ - start of string
(.*?) - Capturing group 1: any zero or more chars other than line break chars, as few as possible due to the non-greedy ("lazy") quantifier *?
(?:_\d+)? - an optional non-capturing group matching 1 or 0 occurrences of _ and then 1+ digits
$ - end of string.
However, it seems easier to use a plain replacement approach:
s = s.replace(/_\d+$/, '')
If the string ends with _ and 1+ digits, that substring will be removed; otherwise the string will not change.
See this regex demo.
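As a rough sketch, both approaches from this answer can be wrapped in small helpers and compared side by side. The function names baseByCapture and baseByReplace are hypothetical, and the sketch assumes single-line input, since . does not match line break characters.

```javascript
// Hypothetical helper using the capture-group pattern; group 1 holds the base string.
function baseByCapture(s) {
  const m = /^(.*?)(?:_\d+)?$/.exec(s);
  // No match is only possible for input containing line breaks; fall back to the input.
  return m === null ? s : m[1];
}

// Hypothetical helper using the replace approach.
function baseByReplace(s) {
  return s.replace(/_\d+$/, "");
}

const inputs = ["lorem_ipsum_test1_123", "lorem_ipsum_test2", "lorem_123", "_123", ""];
for (const s of inputs) {
  console.log(JSON.stringify(s), "=>",
    JSON.stringify(baseByCapture(s)), JSON.stringify(baseByReplace(s)));
}
```

For these single-line samples the two helpers return the same result, which is why the replace version is usually the simpler choice when only the stripped string is needed.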
Try checking whether the string contains the trailing number. If it does, you take only the other part. Otherwise you take the whole string.
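var str = "lorem_ipsum_test1_123";
if (/_[0-9]+$/.test(str)) {
  // Trailing _number present: match everything before it (logs an array of matches)
  console.log(str.match(/(.+)(?=_[0-9]+)/g));
} else {
  // No trailing _number: the string is already the base
  console.log(str);
}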
Or, a lot more concise:
str = str.replace(/_[0-9]+$/g, "")