A regex to extract the 1st and 2nd groups, or only the 1st group if there is no 2nd (including variations)
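The best way to explain is to show exactly what I am trying to achieve:
Case 1: "search for fenway park in boston"
Extract: group 1 -> "fenway park", group 2 -> "boston"
Case 2: "search for fenway park"
Extract: group 1 -> "fenway park"
Note that in both cases I want to allow for variations of "search" ("look for", "find", etc.) and of "in" ("at", "around", etc.).
I have tried many different variants, but either I end up with "fenway park in boston" in group 1 and nothing in group 2, or, if I get case 1 right, case 2 stops working.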
This should work for you:
^(?:search for|look for|find)\s*(.*?)(?:\s*(?:in|around|at)\s*(.*))?$
You can support more variations, beyond look for/in/at, by adding more "or" alternatives to the non-capturing groups.
Explanation:
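As a rough illustration of the pattern in use, here is a minimal Python sketch. The answer does not name a language, so Python is an assumption, and `VERBS`, `PREPOSITIONS`, `build_pattern`, and `extract` are hypothetical names introduced only for this example. It assembles the same regex from two lists so that new alternatives can be added in one place, then runs it on the two cases from the question.

```python
import re

# Hypothetical keyword lists; extend them with further variations as needed.
VERBS = ["search for", "look for", "find"]
PREPOSITIONS = ["in", "around", "at"]


def build_pattern(verbs, prepositions):
    """Assemble the answer's regex from lists of alternatives."""
    verb_alt = "|".join(re.escape(v) for v in verbs)
    prep_alt = "|".join(re.escape(p) for p in prepositions)
    return re.compile(
        rf"^(?:{verb_alt})\s*(.*?)(?:\s*(?:{prep_alt})\s*(.*))?$"
    )


PATTERN = build_pattern(VERBS, PREPOSITIONS)


def extract(text):
    """Return (group 1, group 2), or None if the text does not match.

    Group 2 is None when the optional location part is absent.
    """
    m = PATTERN.match(text)
    return m.groups() if m else None


print(extract("search for fenway park in boston"))  # ('fenway park', 'boston')
print(extract("search for fenway park"))            # ('fenway park', None)
```

The lazy `(.*?)` in group 1 only grows until the optional location part (or the end of the string) can match, which is why case 2 still works when there is no "in", "at", or "around".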
@"
^ # Assert position at the beginning of a line (at beginning of the string or after a line break character)
(?: # Match the regular expression below
   # Match either the regular expression below (attempting the next alternative only if this one fails)
   search\ for # Match the characters “search for” literally
   | # Or match regular expression number 2 below (attempting the next alternative only if this one fails)
   look\ for # Match the characters “look for” literally
   | # Or match regular expression number 3 below (the entire group fails if this one fails to match)
   find # Match the characters “find” literally
)
\s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
   * # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
( # Match the regular expression below and capture its match into backreference number 1
   . # Match any single character that is not a line break character
      *? # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
)
(?: # Match the regular expression below
   \s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
      * # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
   (?: # Match the regular expression below
      # Match either the regular expression below (attempting the next alternative only if this one fails)
      in # Match the characters “in” literally
      | # Or match regular expression number 2 below (attempting the next alternative only if this one fails)
      around # Match the characters “around” literally
      | # Or match regular expression number 3 below (the entire group fails if this one fails to match)
      at # Match the characters “at” literally
   )
   \s # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
      * # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
   ( # Match the regular expression below and capture its match into backreference number 2
      . # Match any single character that is not a line break character
         * # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
   )
)? # Between zero and one times, as many times as possible, giving back as needed (greedy)
$ # Assert position at the end of a line (at the end of the string or before a line break character)
"
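One consequence of the breakdown above: because each `\s*` may match zero whitespace characters, "in" or "at" can also match inside a word. The sketch below, in Python (an assumption, since the answer names no language), compares the pattern as given with a stricter variant that uses `\s+`. The `STRICT` pattern is not part of the original answer; it is only one possible adjustment.

```python
import re

# The pattern exactly as given in the answer.
ORIGINAL = re.compile(
    r"^(?:search for|look for|find)\s*(.*?)(?:\s*(?:in|around|at)\s*(.*))?$"
)

# Assumed stricter variant (not from the answer): \s+ requires at least one
# whitespace character around the keywords, so they cannot match mid-word.
STRICT = re.compile(
    r"^(?:search for|look for|find)\s+(.*?)(?:\s+(?:in|around|at)\s+(.*))?$"
)

query = "find dining in boston"
print(ORIGINAL.match(query).groups())  # ('d', 'ing in boston')
print(STRICT.match(query).groups())    # ('dining', 'boston')

# Both patterns agree on the two cases from the question.
for text in ("search for fenway park in boston", "search for fenway park"):
    print(ORIGINAL.match(text).groups() == STRICT.match(text).groups())  # True
```

The trade-off is that the stricter variant no longer accepts input with missing spaces, such as "search forfenway park", which the original pattern tolerates.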