Is this result for regular expression backreferencing correct?
I used JavaScript in the command-line client of MongoDB v2.2.4 to run the following regular expression backreferencing:
> /([AB])([AB])/("BA")
[ "BA", "B", "A" ]
I thought I should get ["B","A"], but I got an extra element "BA" at the beginning of the array. I tried the same regular expression in Python, and the result is what I expected:
>>> re.search('([AB])([AB])','BA').groups()
('B', 'A')
So, can I say that the result of the regular expression backreferencing from JavaScript in MongoDB is wrong?
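In JavaScript (and many other regex engines), group 0 is the entire match, and the capturing groups are numbered from 1. Python's re module numbers groups the same way; the difference is only that .groups() leaves out group 0.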
The MongoDB result includes the whole matched string, or group 0, as well as groups 1 and 2.
The Python .groups() method only returns the captured groups. The .group() method, called without an argument, returns group 0 too:
>>> re.search('([AB])([AB])', 'BA').groups()
('B', 'A')
>>> re.search('([AB])([AB])', 'BA').group()
'BA'
>>> re.search('([AB])([AB])', 'BA').group(1)
'B'
>>> re.search('([AB])([AB])', 'BA').group(2)
'A'
>>> re.search('([AB])([AB])', 'BA').group(0)
'BA'
This is documented in the re module documentation for the .groups() method:
Return a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern.
and for the .group() method:
Returns one or more subgroups of the match. If there is a single argument, the result is a single string; if there are multiple arguments, the result is a tuple with one item per argument. Without arguments, group1 defaults to zero (the whole match is returned).
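As a small sketch of the behaviour quoted above, the following shows .group() with zero, one, and several arguments, and one way to build the same list the MongoDB shell printed. The variable names (whole, pair, js_style) are only illustrative.

```python
import re

m = re.search(r'([AB])([AB])', 'BA')

whole = m.group()                     # no argument: same as m.group(0), the whole match
pair = m.group(1, 2)                  # several arguments: a tuple, one item per argument
js_style = [m.group(0), *m.groups()]  # group 0 followed by the captured groups

print(whole)     # BA
print(pair)      # ('B', 'A')
print(js_style)  # ['BA', 'B', 'A']
```

The last line has the same shape as the JavaScript result: the whole match first, then each captured group.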
Note that there are no back-references in your expression. A back-reference would look like this instead:
r'([AB])\1'
where \1 refers to the first capturing group, here the one just before it. The back-reference only matches the exact same characters that the referenced group matched.
Demo:
>>> re.search(r'([AB])\1', 'BA')
>>> re.search(r'([AB])\1', 'BB')
<_sre.SRE_Match object at 0x107098210>
Note how only BB is matched, not BA.
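To see the back-reference at work on a longer input, here is a minimal sketch that finds every doubled letter and then collapses each pair to a single letter. The input string 'ABBAAB' and the variable names are made up for illustration.

```python
import re

pattern = re.compile(r'([AB])\1')  # one letter, then the same letter again

# finditer yields a match object per hit; group(0) is the whole doubled pair
doubled = [m.group(0) for m in pattern.finditer('ABBAAB')]
print(doubled)  # ['BB', 'AA']

# In the replacement string, \1 refers to whatever group 1 captured
collapsed = pattern.sub(r'\1', 'ABBAAB')
print(collapsed)  # ABAB
```

Pairs of different letters such as AB or BA are left alone, because \1 has to repeat exactly what group 1 matched.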
You can use named groups too:
'(?P<a_or_b>[AB])(?P=a_or_b)'
where a_or_b is the group name.
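A short sketch of the named form in use, assuming the same test strings as in the demo above; the variable names are illustrative only.

```python
import re

named = re.compile(r'(?P<a_or_b>[AB])(?P=a_or_b)')

m = named.search('BB')
by_name = m.group('a_or_b')  # look the group up by name
by_number = m.group(1)       # named groups are still numbered
as_dict = m.groupdict()      # all named groups as a dict

print(by_name)    # B
print(by_number)  # B
print(as_dict)    # {'a_or_b': 'B'}

no_match = named.search('BA')
print(no_match)   # None
```

The named back-reference (?P=a_or_b) behaves like \1 here; it just refers to the group by name instead of by position.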