
Regex to capture optional characters

I want to pull out a base string (Wax or noWax) from a longer string, along with potentially any data before and after it if the string is Wax. I'm having trouble getting the last item in my list below (noWax) to match.

Can anyone flex their regex muscles? I'm fairly new to regex, so advice on optimization is welcome as long as all of the matches below are found.

What I'm working with in Regex101:


/(?<Wax>Wax(?:Only|-?\d+))/mg

Original string              Need to extract in a capturing group
Loc3_341001_WaxOnly_S212     WaxOnly
Loc4_34412-a_Wax4_S231       Wax4
Loc3a_231121-a_Wax-4-S451    Wax-4
Loc3_34112_noWax_S311        noWax
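To make the failure concrete, here is a minimal Python sketch that runs the pattern above against the four sample strings. Python's `re` module writes named groups as `(?P<name>...)`, so the group syntax is adapted; the names `ORIGINAL`, `SAMPLES` and `extract_original` are made up for this illustration.

```python
import re

# Adaptation: Python's re spells the named group (?P<Wax>...) rather than
# (?<Wax>...). The rest of the pattern is the one from the question.
ORIGINAL = re.compile(r"(?P<Wax>Wax(?:Only|-?\d+))")

SAMPLES = [
    "Loc3_341001_WaxOnly_S212",
    "Loc4_34412-a_Wax4_S231",
    "Loc3a_231121-a_Wax-4-S451",
    "Loc3_34112_noWax_S311",
]

def extract_original(text):
    """Return the text captured by the Wax group, or None if nothing matches."""
    m = ORIGINAL.search(text)
    return m.group("Wax") if m else None

original_results = [extract_original(s) for s in SAMPLES]
print(original_results)  # ['WaxOnly', 'Wax4', 'Wax-4', None]
```

The last string fails because the pattern requires 'Only' or digits after 'Wax', and in 'noWax_S311' the next character is '_'. The pattern also has no way to include the leading 'no'.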

Here is one way to do so, using a conditional:

(?<Wax>(no)?Wax(?(2)|(?:Only|-?\d+)))



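  • (no)? : Optional capture group. It is group 2, because the named group Wax is group 1.
  • (?(2)...|...) : Conditional (if / else).
    • (2) : Test whether capture group 2 ( (no) ) took part in the match. If it did, match nothing more.
    • | : Or, if it did not,
    • (?:Only|-?\d+) : match 'Only', or an optional '-' followed by one or more digits.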
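The conditional can be tried out in Python, whose `re` module also supports the `(?(2)...|...)` construct. This is only a sketch: the named group is written `(?P<Wax>...)` because Python does not accept `(?<Wax>...)`, and the names `CONDITIONAL`, `SAMPLES` and `extract_conditional` are illustrative.

```python
import re

# Group 1 is the named group Wax, group 2 is (no).
# (?(2)|...) : if group 2 matched, match nothing more; otherwise require
# 'Only' or an optional '-' followed by digits.
CONDITIONAL = re.compile(r"(?P<Wax>(no)?Wax(?(2)|(?:Only|-?\d+)))")

SAMPLES = [
    "Loc3_341001_WaxOnly_S212",
    "Loc4_34412-a_Wax4_S231",
    "Loc3a_231121-a_Wax-4-S451",
    "Loc3_34112_noWax_S311",
]

def extract_conditional(text):
    """Return the text captured by the Wax group, or None if nothing matches."""
    m = CONDITIONAL.search(text)
    return m.group("Wax") if m else None

conditional_results = [extract_conditional(s) for s in SAMPLES]
print(conditional_results)  # ['WaxOnly', 'Wax4', 'Wax-4', 'noWax']
```

Note that group numbering is engine-specific. The numbering used here (named group first, then `(no)`) holds for PCRE and Python; other engines may number named groups differently, so check before porting the pattern.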

I assume the following match is desired.

  • the match must include 'Wax'
  • 'Wax' is to be preceded by '_' or by '_no'. In the latter case, 'no' is included in the match.
  • 'Wax' may be followed by:
    • 'Only' followed by '_', in which case 'Only' is part of the match, or
    • one or more digits, followed by '_', in which case the digits are part of the match, or
    • '-' followed by one or more digits, followed by '-', in which case '-' followed by one or more digits is part of the match.

If these assumptions are correct, the string can be matched against the following regular expression:

(?<=_)(?:(?:no)?Wax(?:(?:Only|\d+)?(?=_)|\-\d+(?=-)))


The regular expression can be broken down as follows.

(?<=_)            # positive lookbehind asserts previous character is '_'
(?:               # begin non-capture group
  (?:no)?         # optionally match 'no'
  Wax             # match literal
  (?:             # begin non-capture group
    (?:Only|\d+)? # optionally match 'Only' or >=1 digits
    (?=_)         # positive lookahead asserts next character is '_'
    |             # or
    \-\d+         # match '-' followed by >= 1 digits
    (?=-)         # positive lookahead asserts next character is '-'
  )               # end non-capture group
)                 # end non-capture group
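The commented breakdown above can be used almost verbatim as a pattern in engines that offer a free-spacing mode. Below is a minimal Python sketch using `re.VERBOSE`; the names `LOOKAROUND`, `COMPACT`, `SAMPLES` and `extract_lookaround` are illustrative, and the string 'Loc9_000_Wax4-S1' is a made-up input that is not in the question.

```python
import re

# Free-spacing version of the expression: with re.VERBOSE, whitespace and
# '#' comments inside the pattern are ignored.
LOOKAROUND = re.compile(r"""
    (?<=_)            # previous character must be '_'
    (?:
      (?:no)?         # optionally match 'no'
      Wax             # match literal 'Wax'
      (?:
        (?:Only|\d+)? # optionally match 'Only' or one or more digits
        (?=_)         # next character must be '_'
        |             # or
        \-\d+         # '-' followed by one or more digits
        (?=-)         # next character must be '-'
      )
    )
""", re.VERBOSE)

# The same expression on one line, as given in the answer.
COMPACT = re.compile(r"(?<=_)(?:(?:no)?Wax(?:(?:Only|\d+)?(?=_)|\-\d+(?=-)))")

SAMPLES = [
    "Loc3_341001_WaxOnly_S212",
    "Loc4_34412-a_Wax4_S231",
    "Loc3a_231121-a_Wax-4-S451",
    "Loc3_34112_noWax_S311",
]

def extract_lookaround(text, pattern=LOOKAROUND):
    """Return the whole match (there is no capture group), or None."""
    m = pattern.search(text)
    return m.group(0) if m else None

verbose_results = [extract_lookaround(s) for s in SAMPLES]
compact_results = [extract_lookaround(s, COMPACT) for s in SAMPLES]
print(verbose_results)  # ['WaxOnly', 'Wax4', 'Wax-4', 'noWax']

# Hypothetical input: digits followed by '-' instead of '_' break the stated
# assumptions, so this stricter pattern does not match.
strict_result = extract_lookaround("Loc9_000_Wax4-S1")
print(strict_result)  # None
```

Because the lookarounds check the surrounding delimiters, this expression is stricter than the conditional one: it rejects strings whose delimiters do not follow the assumptions listed above. The result is the overall match rather than a named group.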
