What's the difference between () and [] in regular expression patterns?

What is the difference between enclosing part of a regular expression in () (parentheses) and enclosing it in [] (square brackets)?

How does this:

[a-z0-9]

differ from this:

(a-z0-9)

?

[] denotes a character class. () denotes a capturing group.

[a-z0-9] -- One character that is in the range a-z OR 0-9

(a-z0-9) -- Explicit capture of the literal text a-z0-9. No ranges.

a -- Can be matched by [a-z0-9].

a-z0-9 -- Can be captured by (a-z0-9) and can then be referenced in a replacement and/or later in the expression.
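
A minimal Python sketch of the distinction above, assuming the standard re module; the sample strings and variable names are made up for illustration:

```python
import re

char_class = re.compile(r"[a-z0-9]")     # one character from a-z or 0-9
literal_group = re.compile(r"(a-z0-9)")  # the literal text "a-z0-9", captured

samples = ["a", "7", "ab", "a-z0-9"]

# fullmatch succeeds only if the whole string matches the pattern.
class_hits = [s for s in samples if char_class.fullmatch(s)]
group_hits = [s for s in samples if literal_group.fullmatch(s)]

# The parentheses make the matched text available as group 1.
captured = literal_group.fullmatch("a-z0-9").group(1)

print(class_hits)  # ['a', '7']
print(group_hits)  # ['a-z0-9']
print(captured)    # a-z0-9
```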

(…) is a group that groups its contents, like parentheses in math; (a-z0-9) is the grouped sequence a-z0-9. Groups are particularly useful with quantifiers, which then repeat the preceding expression as a whole: a*b* matches any number of a's followed by any number of b's, e.g. a, aaab, bbbbb, etc.; in contrast, (ab)* matches any number of ab's, e.g. ab, abababab, etc.

[…] is a character class that describes the options for one single character; [a-z0-9] describes one single character that can be in the range a-z or 0-9.
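
The quantifier behaviour described in this answer can be checked with a short Python sketch (the sample list is an assumption chosen to include the strings mentioned above plus one extra):

```python
import re

samples = ["a", "aaab", "bbbbb", "ab", "abababab", "aab"]

# a*b*: the quantifiers apply to the single characters a and b separately.
separate = [s for s in samples if re.fullmatch(r"a*b*", s)]

# (ab)*: the quantifier applies to the grouped sequence "ab" as a whole.
grouped = [s for s in samples if re.fullmatch(r"(ab)*", s)]

print(separate)  # ['a', 'aaab', 'bbbbb', 'ab', 'aab']
print(grouped)   # ['ab', 'abababab']
```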

The [] construct in a regex is essentially shorthand for an | over all of its contents. For example, [abc] matches a, b or c. Additionally, the - character has a special meaning inside []: it provides a range construct. The regex [a-z] will match any letter a through z.
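
As a rough illustration, the following Python sketch compares [abc] with the alternation a|b|c over the printable ASCII characters and lists what the range [a-z] accepts; the variable names are hypothetical:

```python
import re
import string

class_form = re.compile(r"[abc]")
alt_form = re.compile(r"a|b|c")

# Both forms should accept and reject exactly the same single characters.
same = all(
    bool(class_form.fullmatch(ch)) == bool(alt_form.fullmatch(ch))
    for ch in string.printable
)

# Inside [], the hyphen builds a range: [a-z] accepts every lowercase ASCII letter.
range_matches = [ch for ch in string.ascii_letters if re.fullmatch(r"[a-z]", ch)]

print(same)                    # True
print("".join(range_matches))  # abcdefghijklmnopqrstuvwxyz
```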

The () construct is a grouping construct that establishes a precedence order (it also affects how matched substrings are accessed, but that's a more advanced topic). The regex (abc) will match the string "abc".

[a-z0-9] will match any single lowercase letter or digit. (a-z0-9) will match the exact string "a-z0-9" and allows two additional things: you can apply quantifiers like *, ? and + to the whole group, and you can reference this match afterwards with $1 or \1. Not useful with your example, though.

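Try ([a-z0-9]+) to match a mixed string of lowercase letters and digits and capture it for later reference (or extraction). Without the +, ([a-z0-9]) captures a single such character.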
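
A small sketch of referencing a captured group, assuming Python's re module, where the replacement syntax is \1 (other languages such as JavaScript or Perl use $1 instead); the input strings are invented for the example:

```python
import re

# In a replacement: wrap every run of lowercase letters/digits in angle brackets.
wrapped = re.sub(r"([a-z0-9]+)", r"<\1>", "abc-123 XYZ")

# Later in the same expression: \1 must repeat whatever group 1 just matched,
# so this pattern finds a doubled character.
doubled = re.search(r"([a-z0-9])\1", "hello")

print(wrapped)           # <abc>-<123> XYZ
print(doubled.group(0))  # ll
print(doubled.group(1))  # l
```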

[a-z0-9] will match one of abcdefghijklmnopqrstuvwxyz0123456789. In other words, square brackets match exactly one character.

(a-z0-9) will match the literal seven-character string a-z0-9, just as if the parentheses weren't there, because the hyphen only forms a range inside square brackets. The () will allow you to read exactly which characters were matched. Parentheses are also useful for OR'ing two expressions with the bar | character. For example, ([a-z]|[0-9]) will match one character, either a lowercase letter or a digit.
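
To see how parentheses let you read back what was matched when alternatives are OR'ed with |, here is a hedged Python sketch; it assumes the alternatives are themselves written as character classes, and the input text is made up:

```python
import re

either = re.compile(r"([a-z]|[0-9])")

# With one capturing group, findall returns the text captured by that group
# for each match, so the result shows which characters matched.
found = either.findall("Ab3-Z9")

# search stops at the first match; group(1) holds the captured character.
first = either.search("Route 66").group(1)

print(found)  # ['b', '3', '9']
print(first)  # o
```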
