Regex to match a string NOT surrounded by brackets

I have to parse a text in which "with" is a keyword unless it is surrounded by square brackets. I have to match the keyword "with". Also, there must be word boundaries on both sides of "with".

Here are some examples where "with" is NOT a keyword:

  • [with]
  • [ with ]
  • [sometext with sometext]
  • [sometext with]
  • [with sometext]

Here are some examples where "with" IS a keyword:

  • with
  • ] with
  • hello with
  • hello with world
  • hello [ world] with hello
  • hello [ world] with hello [world]

Can anyone help? Thanks in advance.

You can look for the word "with" and check that the closest bracket to its left is not an opening bracket, and that the closest bracket to its right is not a closing bracket:

Regex regexObj = new Regex(
    @"(?<!     # Assert that we can't match this before the current position:
     \[        #  An opening bracket
     [^[\]]*   #  followed by any other characters except brackets.
    )          # End of lookbehind.
    \bwith\b   # Match ""with"".
    (?!        # Assert that we can't match this after the current position:
     [^[\]]*   #  Any text except brackets
     \]        #  followed by a closing bracket.
    )          # End of lookahead.", 
    RegexOptions.IgnorePatternWhitespace);
Match matchResults = regexObj.Match(subjectString);
while (matchResults.Success) {
    // matched text: matchResults.Value
    // match start: matchResults.Index
    // match length: matchResults.Length
    matchResults = matchResults.NextMatch();
}

The lookaround expressions don't stop at line breaks; if you want each line to be evaluated separately, use [^[\]\r\n]* instead of [^[\]]* in both the lookbehind and the lookahead.
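
As a rough illustration of the line-by-line variant, the sketch below uses the same lookarounds with [^[\]\r\n]* and runs them over a three-line sample built from the question's examples. The class name LineWiseKeywordDemo, the helper FindKeywords and the sample text are made up for this demonstration; they are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LineWiseKeywordDemo
{
    // Same lookarounds as above, but the character classes also exclude
    // \r and \n, so a bracket on another line cannot affect the result.
    private static readonly Regex KeywordRegex = new Regex(
        @"(?<!\[[^[\]\r\n]*)\bwith\b(?![^[\]\r\n]*\])");

    // Hypothetical helper: returns the start index of every keyword match.
    public static List<int> FindKeywords(string text)
    {
        var indexes = new List<int>();
        foreach (Match m in KeywordRegex.Matches(text))
        {
            indexes.Add(m.Index);
        }
        return indexes;
    }

    public static void Main()
    {
        // Line 1: bracketed, not a keyword. Lines 2 and 3: keywords.
        string text = "[sometext with sometext]\nhello with world\n] with";
        foreach (int index in FindKeywords(text))
        {
            Console.WriteLine(index);
        }
    }
}
```

With the multi-line pattern [^[\]]*, the "with" on the second line of this sample would be rejected, because the lookahead runs across the line break and reaches the "]" that starts the third line. Excluding \r and \n keeps each line independent.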

Nice question. I think it'll be easier to find the matches where your [with] pattern applies, and then invert the result.

You need to match [ , not followed by ] , followed by "with" (and then the corresponding pattern for the closing square bracket).

Matching the [ and the "with" is easy:

\[with

Add a lookahead to exclude ] , and also allow any number of other characters ( .* ):

\[(?!]).*with

Then add the corresponding closing square bracket, i.e. the reverse, with a lookbehind:

\[(?!]).*with.*\](?<!\[)

A bit more tweaking:

\[(?!(.*\].*with)).*with.*\](?<!(with.*\[.*))

Now if you invert this, you should have your desired result (i.e. when this returns 'true', the bracketed pattern matches and you want to exclude those results).
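
One way to read "invert the result" is sketched below, assuming the input is checked one line at a time: a line counts as containing the keyword when the word "with" occurs in it and the final pattern above does not match. The class name InvertedMatchDemo and the helper HasKeyword are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InvertedMatchDemo
{
    // The final pattern from the steps above: it is meant to match when
    // "with" sits inside a pair of square brackets.
    private static readonly Regex Bracketed = new Regex(
        @"\[(?!(.*\].*with)).*with.*\](?<!(with.*\[.*))");

    // Plain whole-word search, used to confirm that the word occurs at all.
    private static readonly Regex Word = new Regex(@"\bwith\b");

    // Hypothetical helper: true when the line contains "with" and the
    // bracketed pattern does not match anywhere in the line.
    public static bool HasKeyword(string line)
    {
        return Word.IsMatch(line) && !Bracketed.IsMatch(line);
    }

    public static void Main()
    {
        string[] lines =
        {
            "[sometext with sometext]",
            "hello [ world] with hello [world]",
        };
        foreach (string line in lines)
        {
            Console.WriteLine("{0} => {1}", line, HasKeyword(line));
        }
    }
}
```

For the two sample lines, the first is reported as not containing the keyword and the second as containing it. Keep in mind that this is a yes/no test on a whole line, so it does not tell you where the keyword is, and a line that mixes bracketed and unbracketed occurrences gets a single answer.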

I think the simplest solution is to preemptively match balanced pairs of brackets and their contents, to get them out of the way as you search for the keyword. Here's an example:

using System;
using System.Text.RegularExpressions;

string s = 
  @"[with0]
  [ with0 ]
  [sometext with0 sometext]
  [sometext with0]
  [with0 sometext]


  with1
  ] with1
  hello with1
  hello with1 world
  hello [ world] with1 hello
  hello [ world] with1 hello [world]";

Regex r = new Regex(@"\[[^][]*\]|(?<KEYWORD>\bwith\d\b)");
foreach (Match m in r.Matches(s))
{
  if (m.Groups["KEYWORD"].Success)
  {
    Console.WriteLine(m.Value);
  }
}
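
The example above uses with0 and with1 only to make the output easy to check. A sketch of the same idea with the real keyword might look like the following; the class name SkipBracketsDemo, the helper KeywordIndexes and the sample string are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SkipBracketsDemo
{
    // The first alternative consumes a balanced [...] pair; the second
    // captures the keyword only when the first did not already swallow it.
    private static readonly Regex Scanner = new Regex(
        @"\[[^][]*\]|(?<KEYWORD>\bwith\b)");

    // Hypothetical helper: start index of each "with" outside brackets.
    public static List<int> KeywordIndexes(string text)
    {
        var indexes = new List<int>();
        foreach (Match m in Scanner.Matches(text))
        {
            if (m.Groups["KEYWORD"].Success)
            {
                indexes.Add(m.Index);
            }
        }
        return indexes;
    }

    public static void Main()
    {
        string text = "hello [ with ] with hello [world]";
        Console.WriteLine(string.Join(", ", KeywordIndexes(text)));
    }
}
```

Note that only balanced pairs are skipped. An unclosed bracket, as in "[sometext with", is not consumed by the first alternative, so this sketch would report that "with" as a keyword, whereas the lookaround approach would reject it.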

You'll want to look into both negative lookbehinds and negative lookaheads; these let you check the text around the keyword without consuming the brackets.
