Why don't boundaries work in regex brackets?
To begin, I would like to note that a similar question exists with answers and workarounds specific to PHP. I am seeing this issue in C# and I would like to understand the logic behind this apparent "gotcha".
The word boundary character \b doesn't seem to work properly when placed inside a Regex set (aka "box brackets": []). Is this a syntactic issue, are word boundaries intentionally excluded from set matching, or is there some other explanation I'm missing?
Here is a program to demonstrate the issue:
namespace TestProgram
{
    using System.Text.RegularExpressions;
    using System.Diagnostics;

    class Program
    {
        static void Main(string[] args)
        {
            var text = "[abc]";

            // \b outside a character class is a zero-width word boundary.
            var BaselineRegex = new Regex(@"(?:\b)(abc)");
            Debug.Assert(BaselineRegex.IsMatch(text)); // Assertion Passes

            // \b inside a character class is not treated as a word boundary.
            var BracketRegex = new Regex(@"(?:[\b])(abc)");
            Debug.Assert(BracketRegex.IsMatch(text)); // Assertion Fails!
        }
    }
}
Here are web versions to demonstrate as well:
To quote Wiktor Stribiżew's comment:
[\b] is a backspace char matching pattern, that is all.
So while \b is a zero-width word boundary outside of a character class, it refers to the backspace character (0x8 in ASCII) when used within a character class. Further details are provided in this post.
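As a minimal sketch of the behavior described above: the first two methods show that [\b] matches a literal backspace and nothing else, and the third shows one possible workaround, assuming the goal was "a word boundary or one of a few separator characters". The class name WordBoundaryDemo, the method names, and the separator set [-_] are hypothetical and chosen only for illustration. The workaround uses alternation, because a character class always consumes exactly one character and so cannot hold a zero-width assertion.

```csharp
using System;
using System.Text.RegularExpressions;

static class WordBoundaryDemo
{
    // Inside a character class, \b is the backspace character (U+0008).
    // The C# literal "\babc" starts with that same character.
    public static bool BracketMatchesBackspace() =>
        Regex.IsMatch("\babc", @"[\b]abc");

    // The question's input contains no backspace, so [\b] cannot match.
    public static bool BracketMatchesQuestionText() =>
        Regex.IsMatch("[abc]", @"[\b]abc");

    // Hypothetical workaround: alternate between the zero-width \b
    // and an ordinary character class instead of nesting \b in the class.
    public static bool BoundaryOrSeparator(string text) =>
        Regex.IsMatch(text, @"(?:\b|[-_])abc");

    static void Main()
    {
        Console.WriteLine(BracketMatchesBackspace());      // True
        Console.WriteLine(BracketMatchesQuestionText());   // False
        Console.WriteLine(BoundaryOrSeparator("[abc]"));   // True (word boundary before 'a')
        Console.WriteLine(BoundaryOrSeparator("x_abc"));   // True ('_' matched by the class)
        Console.WriteLine(BoundaryOrSeparator("xabc"));    // False (no boundary, no separator)
    }
}
```

The "x_abc" case matches only through the character class: the underscore counts as a word character, so there is no word boundary on either side of it.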
Wiktor: If you would like to post your own answer I would be happy to accept it over this one.