Python regex replacement doesn't work as I expect

Question

I am trying to create a regular expression to replace part of a string. This is an example of the string:

import re

string = u'/nl/nl/1681/1/0/a/all/'
pattern = r'(/\w{2}/\w{2}/)(\d+)/(\d+)(/\d+/[ans]/all/)'
pattern_obj = re.compile(pattern)

The pattern specifies 4 groups. If you do a search, the results are as follows:

m = pattern_obj.search(string)
m.group(0) -> u'/nl/nl/1681/1/0/a/all/'
m.group(1) -> u'/nl/nl/'
m.group(2) -> u'1681'
m.group(3) -> u'1'
m.group(4) -> u'/0/a/all/'

So far so good. Now I specify a replacement string as follows:

replacement = r'\1' + '1000' + '/' + '20' + r'\4'

and issue the following statement:

pattern_obj.sub(replacement,string)

and this results in:

u'H00/20/0/a/all/'

I expected this:

u'/nl/nl/1000/20/0/a/all/'

I must be doing something wrong, but I don't know what. Can anybody help me out?

Answer 1

Your replacement string, once fully assembled, is \11000/20\4, and \110 is interpreted as the octal escape for H rather than as a back-reference to group 1 followed by the literal characters 10. That is why the output starts with H, followed by the remaining 00/20 and the contents of group 4.
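
The misreading can be reproduced in a few lines. This is a minimal sketch that reuses the pattern and string from the question; the names buggy, octal_char and result are introduced here purely for illustration.

```python
import re

pattern_obj = re.compile(r'(/\w{2}/\w{2}/)(\d+)/(\d+)(/\d+/[ans]/all/)')
string = '/nl/nl/1681/1/0/a/all/'

# Replacement assembled exactly as in the question.
# The pieces concatenate to: backslash, "11000/20", backslash, "4".
buggy = r'\1' + '1000' + '/' + '20' + r'\4'

# Octal 110 is decimal 72, which is the code point of "H".
octal_char = chr(0o110)

# re.sub reads \110 as that octal escape, so group 1 is never inserted.
result = pattern_obj.sub(buggy, string)
print(repr(buggy), octal_char, result)
```

Printing repr(buggy) makes the collision visible: the digits of '1000' sit directly after the \1, so the template parser sees three octal digits in a row.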

You need to write \g<1> instead of \1 to make sure that it is unambiguously a back-reference. See the documentation for re.sub.
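
As a sketch of that fix applied to the question's own data (the names fixed, build and also_fixed are hypothetical, chosen only for this example):

```python
import re

pattern_obj = re.compile(r'(/\w{2}/\w{2}/)(\d+)/(\d+)(/\d+/[ans]/all/)')
string = '/nl/nl/1681/1/0/a/all/'

# \g<1> closes the group number explicitly, so the digits after it stay literal.
replacement = r'\g<1>' + '1000' + '/' + '20' + r'\g<4>'
fixed = pattern_obj.sub(replacement, string)
print(fixed)

# Alternative: a replacement function avoids template escapes altogether.
def build(match):
    return match.group(1) + '1000/20' + match.group(4)

also_fixed = pattern_obj.sub(build, string)
print(also_fixed)
```

The function form may be the safer choice when the inserted values come from variables and could begin with a digit.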

3 2012-09-13 16:34:54