
Regex not matching hash character between word boundary

I'm seeing strange behaviour with the regex \bC#?\b:

string s1 = Regex.Replace("Bla Ca bla", @"\bCa?\b", "[$0]"); // Bla [Ca] bla (as expected)
string s2 = Regex.Replace("Bla C# bla", @"\bC#?\b", "[$0]"); // Bla [C]# bla (why???)

Does anyone understand why it happens and how to match an optional # at the end?

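Because \b marks the boundaries of a word, and in regexes a word is a sequence of word characters: letters, digits and the underscore (see here). Other characters are not included. In the first example, a is a letter, so Ca is a word. In the second, # is not a word character, so the word consists of only C. A \b after the # would need a word character on one side of that position, but # is followed by a space, so there is no boundary there. The engine therefore backtracks, lets #? match nothing, and finds the closing \b between C and #.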
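To make the boundary positions visible, here is a small sketch that replaces every \b match with a | character. The class and method names (BoundaryDemo, MarkBoundaries) are made up for illustration and are not from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

static class BoundaryDemo
{
    // Hypothetical helper: inserts '|' at every position where \b matches.
    public static string MarkBoundaries(string input) =>
        Regex.Replace(input, @"\b", "|");

    static void Main()
    {
        // There is a boundary between C and #, but none between # and the space.
        Console.WriteLine(MarkBoundaries("C# bla")); // |C|# |bla|
        Console.WriteLine(MarkBoundaries("Ca bla")); // |Ca| |bla|
    }
}
```

No | appears after the #, which is why \bC#?\b can only end right after the C.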

To see the difference, try removing \b:

string s2 = Regex.Replace("Bla C# bla", @"C#?", "[$0]"); // Bla [C#] bla

If you still need a \b-style boundary, check out this thread for some suggestions.
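One possible approach, offered as a sketch rather than as the linked thread's suggestion, is to keep the leading \b and replace the trailing one with a negative lookahead (?!\w), which only requires that no word character follows. The class and method names (OptionalHashDemo, Bracket) are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class OptionalHashDemo
{
    // Hypothetical helper: brackets "C" or "C#" when no word character follows it.
    public static string Bracket(string input) =>
        Regex.Replace(input, @"\bC#?(?!\w)", "[$0]");

    static void Main()
    {
        Console.WriteLine(Bracket("Bla C# bla")); // Bla [C#] bla
        Console.WriteLine(Bracket("Bla C bla"));  // Bla [C] bla
        Console.WriteLine(Bracket("Bla Ca bla")); // Bla Ca bla (no match)
    }
}
```

Note that for input such as C#x this pattern backtracks and matches only the C, because the # is followed by a word character. Whether that is acceptable depends on what the surrounding text can contain.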
