Why does \g<0> behave differently from \0 in re.sub?

Question

I'm using Python 3.3.

re.sub("(.)(.)",r"\2\1\g<0>","ab")  returns baab

But

re.sub("(.)(.)",r"\2\1\0","ab")  returns ba

Is this a bug in the sub method, or does the sub method not recognize \0 on purpose for some reason?

Answer 1

As written on this page, \0 is interpreted as the null character (\x00), and group numbers start at 1 in Python (according to the re module documentation):

\number

Matches the contents of the group of the same number. Groups are numbered starting from 1. For example, (.+) \1 matches 'the the' or '55 55', but not 'thethe' (note the space after the group). This special sequence can only be used to match one of the first 99 groups. If the first digit of number is 0, or number is 3 octal digits long, it will not be interpreted as a group match, but as the character with octal value number. Inside the '[' and ']' of a character class, all numeric escapes are treated as characters.
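The quoted rules can be checked directly. The following is a minimal sketch, not part of the original answer; the variable names are made up for illustration.

```python
import re

# Backreference in a pattern: \1 must repeat whatever group 1 captured.
pattern = re.compile(r"(.+) \1")

hit_words = pattern.search("the the")
hit_digits = pattern.search("55 55")
miss = pattern.search("thethe")  # no space after the group, so no match

# A leading 0, or three octal digits, is a character escape, not a group.
null_match = re.match(r"\0", "\x00")  # \0 matches the null character
octal_match = re.match(r"\101", "A")  # octal 101 is 65, i.e. "A"

print(hit_words.group(0), hit_digits.group(0), miss)
print(null_match is not None, octal_match.group(0))
```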

Also, according to the page previously linked, it's not a bug but desired behaviour (this is obvious, since it's documented).
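To see the difference concretely, here is a small sketch that reruns both calls from the question and prints the results with repr(), so the otherwise invisible null character shows up. The variable names are illustrative only. The re documentation also states that \g<0> substitutes the entire substring matched by the pattern.

```python
import re

# \g<0> in the replacement refers to the whole match.
with_whole_match = re.sub("(.)(.)", r"\2\1\g<0>", "ab")

# \0 in the replacement is an octal escape for the null character.
with_null_char = re.sub("(.)(.)", r"\2\1\0", "ab")

print(repr(with_whole_match))  # 'baab'
print(repr(with_null_char))    # 'ba\x00'
print(len(with_null_char))     # 3
```

The second result only looks like ba when printed normally, because the trailing \x00 has no visible glyph.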

Answer 2

\0 is interpreted as an escape for the null character \x00, and re does not recognize it as a capture group.

Reference:

Python Standard Library documentation
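Group 0 still exists on the match object, even though \0 is not a way to reach it from a replacement string. As a sketch of one alternative (the helper name swap_then_whole is hypothetical, not from the answers), a replacement function can call m.group(0) directly:

```python
import re

def swap_then_whole(m):
    # m.group(0) is the entire match; groups 1 and 2 are the two characters.
    return m.group(2) + m.group(1) + m.group(0)

result = re.sub("(.)(.)", swap_then_whole, "ab")
print(result)  # baab
```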

2 answers

Answer 1: 9 (accepted), 2014-01-09 19:06:30

Answer 2: 1, 2014-01-09 19:07:06