
Regular expressions with word boundaries fail in .NET Regex

I have a problem with .NET regular expressions in C#. We are trying to match special tokens in a text, each surrounded by the paragraph sign §. For completeness, the corresponding regular expressions are wrapped in word boundaries \b. The problem is that the regular expression wrapped in \b does not match the tokens:

    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            string data = "I would like to replace this §pattern§ with something interesting";
            string requiredResult = "I would like to replace this serious text with something interesting";

            // Same token, once without and once with surrounding word boundaries.
            Regex regSuccess = new Regex("§pattern§");
            Regex regFail = new Regex(@"\b§pattern§\b");

            var dataSuccess = regSuccess.Replace(data, "serious text");
            var dataFail = regFail.Replace(data, "serious text");

            Console.WriteLine("regSuccess match: {0}", dataSuccess == requiredResult);
            Console.WriteLine("regFail match: {0}", dataFail == requiredResult);
            Console.WriteLine("Press enter to continue");
            Console.ReadLine();
        }
    }

As you can see, dataFail == requiredResult evaluates to false.

Replace

    Regex regFail = new Regex(@"\b§pattern§\b");

with

    Regex regFail = new Regex(@"§\bpattern\b§");
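
As a quick check, the suggested pattern can be run against the sample text from the question. This is only a sketch: the class name BoundaryFixCheck and the variable regFixed are made up for illustration, and the strings are copied from the question.

```csharp
using System;
using System.Text.RegularExpressions;

class BoundaryFixCheck
{
    static void Main()
    {
        string data = "I would like to replace this §pattern§ with something interesting";
        string requiredResult = "I would like to replace this serious text with something interesting";

        // \b now sits between § (non-word character) and a letter (word character),
        // which is a position where a word boundary exists.
        var regFixed = new Regex(@"§\bpattern\b§");
        string result = regFixed.Replace(data, "serious text");

        Console.WriteLine("regFixed match: {0}", result == requiredResult); // prints "regFixed match: True"
    }
}
```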

§ is a non-word character, so \b placed outside the § signs prevents the pattern from matching: \b only matches at a position between a word character and a non-word character, and in the sample text each § has a space (also a non-word character) on its outer side. Perhaps you do not even need \b here, since pattern is already enclosed by non-word characters?

    Regex regFail = new Regex(@"§pattern§");
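
To see where \b can and cannot match, the three variants can be compared on a few inputs. This is a minimal sketch, not part of the original answer: the class name BoundaryDemo, the variable names, and the extra test strings "x§pattern§y" and "§pattern§" are assumptions chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class BoundaryDemo
{
    static void Main()
    {
        var outer = new Regex(@"\b§pattern§\b"); // boundaries outside the § signs
        var inner = new Regex(@"§\bpattern\b§"); // boundaries inside the § signs
        var plain = new Regex("§pattern§");      // no boundaries at all

        string[] inputs =
        {
            "this §pattern§ with", // § has spaces on its outer sides
            "x§pattern§y",         // § has word characters on its outer sides
            "§pattern§"            // § sits at the start and end of the string
        };

        foreach (string input in inputs)
        {
            // outer only succeeds when a word character touches each §,
            // i.e. only for the second input; inner and plain succeed for all three.
            Console.WriteLine("{0}: outer={1}, inner={2}, plain={3}",
                input, outer.IsMatch(input), inner.IsMatch(input), plain.IsMatch(input));
        }
    }
}
```

Because the literal pattern starts and ends with letters, a boundary always exists right next to each §, so the inner \b and the plain form behave the same for this token.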


 