Problem with RegEx OR operator in C#

I want to match the pattern [0-9][0-9]KK[a-z][a-z] when it is not preceded by either of these words:

  • http://

  • example

I have a regex that handles the first criterion, but not the second.

Without the OR operator:

var body = Regex.Replace(body, "(?<!http://([\\w+?\\.\\w+])+([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?)([0-9][0-9]KK[a-z][a-z])(?!</a>)","replaced");

With the OR operator:

var body = Regex.Replace(body, "(?example)|(?<!http://([\\w+?\\.\\w+])+([a-zA-Z0-9\\~\\!\\@\\#\\$\\%\\^\\&\\*\\(\\)_\\-\\=\\+\\\\\\/\\?\\.\\:\\;\\'\\,]*)?)([0-9][0-9]KK[a-z][a-z])(?!</a>)","replaced");

The second one, with the OR operator, throws an exception. How can I fix this?

It should not match either of these:

Here is one way to do it. Start at the beginning of the string and check that each character is not the start of 'http://' or 'example'. Do this lazily, one character at a time, so that we can spot the magic word once we reach it. Also, capture everything up to the magic word so that we can put it back in the replacement string. Here it is in commented free-spacing mode so that it can be comprehended by mere mortals:

var body = Regex.Replace(body, 
    @"# Match special word not preceded by 'http://' or 'example'
    ^                           # Anchor to beginning of string
    (?i)                        # Set case-insensitive mode.
    (                           # $1: Capture everything up to  special word.
      (?:                       # Non-capture group for applying * quantifier.
        (?!http://)             # Assert this char is not start of 'http://'
        (?!example)             # Assert this char is not start of 'example'
        .                       # Safe to match this one acceptable char.
      )*?                       # Lazily match zero or more preceding chars.
    )                           # End $1: Everything up to  special word.
    (?-i)                       # Set back to case-sensitive mode.
    ([0-9][0-9]KK[a-z][a-z])    # $2: Match our special word.
    (?!</a>)                    # Assert not end of Anchor tag contents.
    ", 
    "$1replaced",
    RegexOptions.Singleline | RegexOptions.IgnorePatternWhitespace);

Note that this is case sensitive for the magic word but not for http:// and example. Note also that this is untested (I don't know C#, just its regex engine). The "var" in "var body =..." looks kinda suspicious to me.

I wasn't able to get the second example working; it gave an ArgumentException of "Unrecognized grouping construct".

But I replaced the URL matching, moved the first alternative group a bit, and came up with this:

var body = Regex.Replace(body, "(?<!http\\://[a-zA-Z0-9\\-\\.]+\\.[a-zA-Z]{2,3}(/\\S*)?|example)([0-9][0-9]KK[a-z][a-z])(?!</a>)","replaced");
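
The "Unrecognized grouping construct" error can be reproduced in isolation, which suggests the culprit is the (?example) group rather than the | operator itself: a group that opens with (? must continue with a recognized construct such as (?: or (?<!. The following is only a minimal sketch; the class name and the sample strings are made up for illustration, and the exact exception message may differ between .NET versions.

```csharp
using System;
using System.Text.RegularExpressions;

class GroupingConstructDemo
{
    static void Main()
    {
        try
        {
            // "(?example)" is not a valid group: after "(?" the parser expects
            // a construct such as ":", "=", "!", "<=", "<!" or inline options.
            var broken = new Regex("(?example)");
            Console.WriteLine("compiled");
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("rejected: " + ex.Message);
        }

        // A negative lookbehind is written (?<!...).
        // Hypothetical inputs, chosen only to show the difference.
        const string pattern = "(?<!example)[0-9][0-9]KK[a-z][a-z]";
        Console.WriteLine(Regex.IsMatch("example12KKab", pattern)); // False
        Console.WriteLine(Regex.IsMatch("foo 12KKab", pattern));    // True
    }
}
```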

You could use something like this:

body = Regex.Replace(body, @"(?<!\S)(?!(?i:http://|example))\S*\d\dKK[a-z]{2}\b", "replaced");
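
To see what this pattern does in practice, here is a small sketch that runs it over a few sample strings. The inputs and the class name are hypothetical, invented for illustration; only the pattern itself comes from the line above.

```csharp
using System;
using System.Text.RegularExpressions;

class TokenPatternDemo
{
    static void Main()
    {
        // Pattern copied from the answer: the match must start at a
        // whitespace boundary and must not begin with http:// or example.
        const string pattern = @"(?<!\S)(?!(?i:http://|example))\S*\d\dKK[a-z]{2}\b";

        // Hypothetical sample inputs.
        string[] samples =
        {
            "code 12KKab here",        // plain occurrence
            "see http://host/12KKab",  // inside a URL
            "example12KKab",           // preceded by "example"
        };

        foreach (var s in samples)
        {
            Console.WriteLine(Regex.Replace(s, pattern, "replaced"));
        }
        // Prints:
        // code replaced here
        // see http://host/12KKab
        // example12KKab
    }
}
```

Because the pattern begins with \S*, the match covers the whole whitespace-delimited token up to the magic word, so any non-space characters directly in front of the word are replaced along with it.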
