Regular Expression Groups in C#
I've inherited a code block that contains the following regex and I'm trying to understand how it's getting its results.
var pattern = @"\[(.*?)\]";
var matches = Regex.Matches(user, pattern);
if (matches.Count > 0 && matches[0].Groups.Count > 1)
...
For the input user == "Josh Smith [jsmith]":
matches.Count == 1
matches[0].Value == "[jsmith]"
... which I understand. But then:
matches[0].Groups.Count == 2
matches[0].Groups[0].Value == "[jsmith]"
matches[0].Groups[1].Value == "jsmith" <=== how?
Looking at this question, from what I understand the Groups collection stores the entire match as well as the previous match. But doesn't the regex above match only [open square bracket] [text] [close square bracket]? So why would "jsmith" match?
Also, is it always the case that the Groups collection will store exactly 2 groups: the entire match and the last match?
match.Groups[0] is always the same as match.Value, which is the entire match.
match.Groups[1] is the first capturing group in your regular expression.
Consider this example:
var pattern = @"\[(.*?)\](.*)";
var match = Regex.Match("ignored [john] John Johnson", pattern);
In this case:
match.Value is "[john] John Johnson"
match.Groups[0] is always the same as match.Value, "[john] John Johnson".
match.Groups[1] is the group of captures from the (.*?), here "john".
match.Groups[2] is the group of captures from the (.*), here " John Johnson" (including the leading space).
match.Groups[1].Captures is yet another dimension. Consider another example:
var pattern = @"(\[.*?\])+";
var match = Regex.Match("[john][johnny]", pattern);
Note that we are looking for one or more bracketed names in a row. You need to be able to get each name separately. Enter Captures!
match.Groups[0] is always the same as match.Value, "[john][johnny]".
match.Groups[1] is the group of captures from the (\[.*?\])+. Because the group is repeated, match.Groups[1].Value holds only its last capture, "[johnny]", not the whole match.
match.Groups[1].Captures[0] is "[john]"
match.Groups[1].Captures[1] is "[johnny]", the same as match.Groups[1].Value
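To see the difference between a group's Value and its Captures for yourself, here is a minimal sketch using the pattern and input from the example above. The class name CaptureDemo and the helper GroupOneCaptures are made up for illustration; only Regex.Match, Groups and Captures come from the framework.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: returns every capture of group 1, in match order.
    public static string[] GroupOneCaptures(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        return match.Groups[1].Captures
            .Cast<Capture>()          // CaptureCollection is non-generic on older frameworks
            .Select(c => c.Value)
            .ToArray();
    }

    public static void Main()
    {
        var pattern = @"(\[.*?\])+";
        var match = Regex.Match("[john][johnny]", pattern);

        Console.WriteLine(match.Value);            // the whole match: [john][johnny]
        Console.WriteLine(match.Groups[1].Value);  // last capture of the repeated group: [johnny]

        // Each bracketed name, one per line: [john] then [johnny]
        foreach (var capture in GroupOneCaptures("[john][johnny]", pattern))
        {
            Console.WriteLine(capture);
        }
    }
}
```

If you only ever need the last repetition, Groups[1].Value is enough; Captures matters once a group sits under a quantifier such as + or *.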
The ( ) acts as a capture group. So the matches array has all of the matches that C# finds in your string, and the sub-array has the values of the capture groups inside those matches. If you don't want that extra level of capture, just remove the ( ).
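The parentheses also identify a group, so the first entry is the entire match and the second entry is the contents of what was found between the square brackets.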
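On the "is it always exactly 2 groups?" part of the question: as far as I can tell, no. Groups holds the whole match plus one entry per capturing group in the pattern. A rough sketch with the question's input (GroupCountDemo and GroupCount are hypothetical names, and the third pattern is a variation I made up):

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupCountDemo
{
    // Hypothetical helper: size of Groups for the first match,
    // i.e. group 0 plus one entry per capturing group in the pattern.
    public static int GroupCount(string input, string pattern)
    {
        return Regex.Match(input, pattern).Groups.Count;
    }

    public static void Main()
    {
        const string user = "Josh Smith [jsmith]";

        Console.WriteLine(GroupCount(user, @"\[.*?\]"));         // no parentheses: 1
        Console.WriteLine(GroupCount(user, @"\[(.*?)\]"));       // pattern from the question: 2
        Console.WriteLine(GroupCount(user, @"(\w+) \[(.*?)\]")); // two capturing groups: 3
    }
}
```

So the Groups.Count > 1 check in the inherited code is effectively checking that the pattern has at least one capturing group.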
How? The answer is here:
(.*?)
That is a subgroup of @"\[(.*?)\]".
Groups[0] is the entire match (not the entire input string).
Groups[1] is the group captured by the parentheses (.*?).
You can configure Regex to capture explicit (named) groups only (there is an option for that when you create a regex), or use (?:.*?) to create a non-capturing group.
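Here is a small sketch of both ways to avoid the extra group, using the question's input. I'm assuming the option meant above is RegexOptions.ExplicitCapture; the class name, the helper and the group name "login" are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NonCapturingDemo
{
    // Hypothetical helper: size of Groups for the first match under the given options.
    public static int GroupCount(string input, string pattern,
                                 RegexOptions options = RegexOptions.None)
    {
        return Regex.Match(input, pattern, options).Groups.Count;
    }

    public static void Main()
    {
        const string user = "Josh Smith [jsmith]";

        // (?:...) groups without capturing, so only group 0 remains.
        Console.WriteLine(GroupCount(user, @"\[(?:.*?)\]"));                          // 1

        // With ExplicitCapture, plain parentheses stop capturing as well.
        Console.WriteLine(GroupCount(user, @"\[(.*?)\]", RegexOptions.ExplicitCapture)); // 1

        // Named groups are still captured under ExplicitCapture.
        var match = Regex.Match(user, @"\[(?<login>.*?)\]", RegexOptions.ExplicitCapture);
        Console.WriteLine(match.Groups["login"].Value);                               // jsmith
    }
}
```

Note that either change would make the matches[0].Groups.Count > 1 check in the inherited code fail, so only drop the capture if nothing reads Groups[1].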