
Regular expression capture with optional trailing underscore and number

I'm trying to find a regular expression that will match the base string without the optional trailing number (_123). For example:

lorem_ipsum_test1_123 -> capture lorem_ipsum_test1

lorem_ipsum_test2 -> capture lorem_ipsum_test2

I tried the following expression, but it only works when there is a trailing _number:

/(.+)(?>_[0-9]+)/

Similarly, adding the ? (zero or one) quantifier only works when there is no trailing _number; otherwise, the trailing _number just becomes part of the first capture:

/(.+)(?>_[0-9]+)?/
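To make the failure mode concrete, here is a minimal JavaScript sketch of both attempts. JavaScript has no atomic groups, so `(?>...)` is swapped for a plain non-capturing group `(?:...)`; that substitution is an assumption about the engine in use, although for these two inputs the results should be the same. All variable names are made up for illustration.

```javascript
// Stand-ins for the two attempts; (?:...) replaces the atomic group (?>...).
const requiredSuffix = /(.+)(?:_[0-9]+)/;
const optionalSuffix = /(.+)(?:_[0-9]+)?/;

const withNumber = "lorem_ipsum_test1_123";
const withoutNumber = "lorem_ipsum_test2";

// The suffix is mandatory here, so a string without it does not match at all.
const requiredWith = requiredSuffix.exec(withNumber);       // group 1: "lorem_ipsum_test1"
const requiredWithout = requiredSuffix.exec(withoutNumber); // null

// The suffix is optional here, so the greedy (.+) takes the whole string
// and the optional group matches nothing.
const optionalWith = optionalSuffix.exec(withNumber);       // group 1: "lorem_ipsum_test1_123"

console.log(requiredWith[1], requiredWithout, optionalWith[1]);
```

Nothing forces the greedy `(.+)` to give characters back once the suffix is optional, which is why the answers below rely on lookaheads, a lazy quantifier with an end anchor, or a plain replacement.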

Any suggestions?

You may use the following expression:

^(?:[^_]+_)+(?!\d+$)[^_]+
  • ^ Anchors the beginning of the string.
  • (?:[^_]+_)+ Repeated non-capturing group: a negated character set for anything other than a _, followed by a _.
  • (?!\d+$) Negative lookahead for digits at the end of the string.
  • [^_]+ Negated character set for anything other than a _.

Regex demo here.

Please note that the \n in the character sets in the regex demo is only there for demonstration purposes and should be removed when using the expression as a pattern in JavaScript.

JavaScript demo:

// With a trailing _number: the number is left out of the match.
var myString = "lorem_ipsum_test1_123";
var myRegexp = /^(?:[^_]+_)+(?!\d+$)[^_]+/;
var match = myRegexp.exec(myString);
console.log(match[0]);

// Without a trailing _number: the whole string is matched.
myString = "lorem_ipsum_test2";
match = myRegexp.exec(myString);
console.log(match[0]);
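One caveat with this expression: the repeated group `(?:[^_]+_)+` has to match at least once, and the final `[^_]+` must not be the trailing number, so the base string needs at least one underscore of its own. The sketch below illustrates this; the helper name `baseName` and the two extra inputs are hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: returns the matched base string, or null when the pattern does not match.
function baseName(s) {
  const m = /^(?:[^_]+_)+(?!\d+$)[^_]+/.exec(s);
  return m ? m[0] : null;
}

console.log(baseName("lorem_ipsum_test1_123")); // "lorem_ipsum_test1"
console.log(baseName("lorem_ipsum_test2"));     // "lorem_ipsum_test2"
console.log(baseName("test"));                  // null: no underscore at all
console.log(baseName("abc_123"));               // null: only one segment before the number
```

If inputs like `abc_123` can occur, one of the other approaches in this thread may be a better fit.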

You might match any character and use a negative lookahead that asserts that what follows is not an underscore, one or more digits and the end of the string:

^(?:(?!_\d+$).)*

Explanation

  • ^ Assert the start of the string
  • (?: Non-capturing group
    • (?! Negative lookahead: assert that what is on the right is not
      • _\d+$ an underscore, one or more digits and the end of the string
    • ) Close the negative lookahead
    • . Match any character except a line break
  • )* Close the non-capturing group and repeat zero or more times

Regex demo

const strings = [
  "lorem_ipsum_test1_123",
  "lorem_ipsum_test2"
];
let pattern = /^(?:(?!_\d+$).)*/;

// Print each input next to the part matched before any trailing _number.
strings.forEach((s) => {
  console.log(s + " ==> " + s.match(pattern)[0]);
});
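A few edge cases may help show where the lookahead stops the match. This is only a sketch: the helper name `stripWithLookahead` and the extra inputs are assumptions added for illustration.

```javascript
// Same pattern as above: consume characters until "_digits" at the very end would follow.
const tempered = /^(?:(?!_\d+$).)*/;

// Hypothetical helper; the * quantifier allows an empty match, so match() is never null here.
function stripWithLookahead(s) {
  return s.match(tempered)[0];
}

console.log(stripWithLookahead("abc_123"));   // "abc"
console.log(stripWithLookahead("a_12_b"));    // "a_12_b": digits are not at the end
console.log(stripWithLookahead("lorem_1_2")); // "lorem_1": only the last number is dropped
console.log(stripWithLookahead("_123"));      // "": the whole string is the suffix
```

Because the match can be empty, callers that need a non-empty base string would have to check for that themselves.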

You are asking for

/^(.*?)(?:_\d+)?$/

See the regex demo. The point here is that the first dot pattern must be non-greedy, the _\d+ should be wrapped in an optional non-capturing group, and the whole pattern (especially the end) must be enclosed in anchors.

Details

  • ^ - start of string
  • (.*?) - Capturing group 1: any zero or more chars other than line break chars, as few as possible due to the non-greedy ("lazy") quantifier *?
  • (?:_\d+)? - an optional non-capturing group matching 1 or 0 occurrences of _ and then 1+ digits
  • $ - end of string.
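As a rough illustration of the anchored pattern, the sketch below reads the base string from capturing group 1. The helper name `captureBase` and the third input are hypothetical.

```javascript
// Lazy group 1, optional "_digits" suffix, anchored at both ends.
const anchored = /^(.*?)(?:_\d+)?$/;

// Hypothetical helper: returns group 1, or null if the string does not match
// (which should only happen when it contains a line break).
function captureBase(s) {
  const m = anchored.exec(s);
  return m ? m[1] : null;
}

console.log(captureBase("lorem_ipsum_test1_123")); // "lorem_ipsum_test1"
console.log(captureBase("lorem_ipsum_test2"));     // "lorem_ipsum_test2"
console.log(captureBase("abc_1_2"));               // "abc_1"
```

The `$` anchor is what forces the lazy group to keep expanding; without it, `(.*?)` would be satisfied with an empty match.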

However, it seems easier to use a simple replacement approach:

s = s.replace(/_\d+$/, '')

If the string ends with _ and 1+ digits, that substring is removed; otherwise, the string does not change.

See this regex demo.
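The replacement approach can be wrapped in a small function, sketched here with a few sample inputs. The name `stripTrailingNumber` and the inputs beyond the two from the question are assumptions for illustration.

```javascript
// Hypothetical helper: remove one trailing "_digits" suffix if present.
function stripTrailingNumber(s) {
  return s.replace(/_\d+$/, "");
}

console.log(stripTrailingNumber("lorem_ipsum_test1_123")); // "lorem_ipsum_test1"
console.log(stripTrailingNumber("lorem_ipsum_test2"));     // "lorem_ipsum_test2"
console.log(stripTrailingNumber("abc_123"));               // "abc"
console.log(stripTrailingNumber("v_1_2"));                 // "v_1"
```

Since the pattern is anchored with `$`, at most one suffix is removed per call, and strings without a suffix are returned unchanged.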

Check whether the string contains the trailing number. If it does, take only the other part. Otherwise, take the whole string.

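var str = "lorem_ipsum_test1_123"

if (/_[0-9]+$/.test(str)) {
   // Take everything before the trailing _number; the lookahead keeps the suffix out of the match.
   console.log(str.match(/(.+)(?=_[0-9]+$)/)[0])
} else {
   console.log(str)
}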

Or, a lot more concise:

str = str.replace(/_[0-9]+$/g, "")

Note: technical posts on this site follow the CC BY-SA 4.0 license. If you repost this content, please credit this site's URL or the original address.

 