
Why does \g<0> behave differently from \0 in re.sub?

I'm using Python 3.3.

re.sub("(.)(.)",r"\2\1\g<0>","ab")  returns baab

BUT

re.sub("(.)(.)",r"\2\1\0","ab")  returns ba

Is this a bug in the sub method, or does sub deliberately not recognize \0 for some reason?

As explained on this page, \0 is interpreted as the null character (\x00), and group numbers start at 1 in Python (according to the re module documentation):

\number

Matches the contents of the group of the same number. Groups are numbered starting from 1. For example, (.+) \1 matches 'the the' or '55 55', but not 'thethe' (note the space after the group). This special sequence can only be used to match one of the first 99 groups. If the first digit of number is 0, or number is 3 octal digits long, it will not be interpreted as a group match, but as the character with octal value number. Inside the '[' and ']' of a character class, all numeric escapes are treated as characters.
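The documentation excerpt above can be checked with a small sketch. The variable names are only illustrative, and the results are what I would expect on a recent CPython 3:

```python
import re

# \1 in a pattern refers back to whatever group 1 captured.
pattern = re.compile(r"(.+) \1")

matches_spaced = pattern.search("the the") is not None   # repeated word with a space
matches_digits = pattern.search("55 55") is not None     # repeated digits with a space
matches_joined = pattern.search("thethe") is not None    # no space, so no match

# A leading 0 makes the escape an octal character code, not a group number:
# \0 in a pattern matches the null character.
matches_nul = re.fullmatch(r"\0", "\x00") is not None

print(matches_spaced, matches_digits, matches_joined, matches_nul)
```

Only the "thethe" case should fail to match, because the pattern requires a space between the two copies.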

Also, according to the page linked above, this is not a bug but the intended behaviour (which follows from the fact that it is documented).

\0 is interpreted as an escape for the null character \x00, and re does not recognize it as a reference to a capture group. To insert the entire match in a replacement string, use \g<0>.
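As a rough illustration of the difference, the two calls from the question can be compared with repr(), which makes the otherwise invisible null character show up. The variable names here are my own, and the callable form at the end is just one alternative way to reach the whole match:

```python
import re

# \g<0> in the replacement inserts the entire match ("ab").
swapped_plus_match = re.sub("(.)(.)", r"\2\1\g<0>", "ab")

# \0 in the replacement is an octal escape, so a null character is appended.
swapped_plus_nul = re.sub("(.)(.)", r"\2\1\0", "ab")

print(repr(swapped_plus_match))  # 'baab'
print(repr(swapped_plus_nul))    # 'ba\x00'

# A function replacement can use m.group(0) for the whole match instead.
via_function = re.sub(
    "(.)(.)",
    lambda m: m.group(2) + m.group(1) + m.group(0),
    "ab",
)
print(repr(via_function))        # 'baab'
```

When the second result is printed without repr(), it looks like ba because the trailing \x00 is not visible in most terminals.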

Reference:

Python Standard Library documentation
