Is there a way to use regular expressions in the replacement string in re.sub() in Python?

Python's re module provides the following function:

re.sub(pattern, repl, string, count=0, flags=0): Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl. If the pattern isn't found, string is returned unchanged.

I've found it can work like this:

print re.sub('[a-z]*\d+','lion','zebra432') # prints 'lion'

Is there an easy way to use regular expressions in the replacement string, so that the replacement contains part of the text matched by the original pattern? Specifically, can I do something like this (which doesn't work)?

print re.sub('[a-z]*\d+', 'lion\d+', 'zebra432')

I want that to print 'lion432'. Obviously, it does not. Rather, it prints 'lion\d+'. Is there an easy way to use parts of the matched text in the replacement string?

By the way, this is NOT a special case. Please do NOT assume that the number will always come at the end, that the words will always come at the beginning, and so on. I want a solution that works for regexes in general.

Thanks

Place \d+ in a capture group (...) and then use \1 in the replacement string to refer to it:

>>> import re
>>> re.sub(r'[a-z]*(\d+)', r'lion\1', 'zebra432')
'lion432'
>>>
>>> # You can also refer to more than one capture group
>>> re.sub(r'([a-z]*)(\d+)', r'\1lion\2', 'zebra432')
'zebralion432'
>>>
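
The backreference examples above copy captured text verbatim. Where the replacement has to be computed from the match, re.sub() also accepts a function as repl: it is called with each match object and returns the replacement string. The following is a minimal sketch written for Python 3; the helper names keep_digits and double_number and the input 'cat12dog7' are made up for illustration.

```python
import re

def keep_digits(match):
    # match.group(1) is the text captured by the first (...) group.
    return 'lion' + match.group(1)

result_simple = re.sub(r'[a-z]*(\d+)', keep_digits, 'zebra432')
print(result_simple)  # lion432

def double_number(match):
    # match.group(0) is the whole match; the replacement can be any
    # string computed from it, wherever the match sits in the input.
    return str(int(match.group(0)) * 2)

result_doubled = re.sub(r'\d+', double_number, 'cat12dog7')
print(result_doubled)  # cat24dog14
```

A function is more verbose than a backreference, so it is probably only worth it when the replacement is not a fixed template.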

From the docs:

Backreferences, such as \6, are replaced with the substring matched by group 6 in the pattern.
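
Besides \1-style references, the replacement string also understands \g<number> and \g<name>. A short sketch in Python 3 of when each form helps; the group name digits is an arbitrary choice for this example.

```python
import re

text = 'zebra432'

# Plain numbered backreference, as in the answer.
numbered = re.sub(r'[a-z]*(\d+)', r'lion\1', text)

# \g<1> is the unambiguous form: r'lion\10' would be read as a
# reference to group 10, not group 1 followed by a literal '0'.
followed_by_digit = re.sub(r'[a-z]*(\d+)', r'lion\g<1>0', text)

# A named group, (?P<digits>...), is referenced with \g<digits>.
named = re.sub(r'[a-z]*(?P<digits>\d+)', r'lion\g<digits>', text)

print(numbered)           # lion432
print(followed_by_digit)  # lion4320
print(named)              # lion432
```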

Note that you will also need to use a raw string for the replacement so that \1 is not treated as an escape sequence.
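
To see why the raw string matters, compare what Python stores for each spelling of the replacement. This is an illustrative sketch in Python 3; the variable names are arbitrary.

```python
import re

pattern = r'[a-z]*(\d+)'

raw_repl = r'lion\1'      # backslash + '1': re.sub sees a backreference
plain_repl = 'lion\1'     # '\1' is an octal escape, i.e. the character chr(1)
escaped_repl = 'lion\\1'  # doubling the backslash is equivalent to the raw string

with_raw = re.sub(pattern, raw_repl, 'zebra432')
with_plain = re.sub(pattern, plain_repl, 'zebra432')
with_escaped = re.sub(pattern, escaped_repl, 'zebra432')

print(repr(with_raw))      # 'lion432'
print(repr(with_plain))    # 'lion\x01'
print(repr(with_escaped))  # 'lion432'
```

Without the r prefix the backslash never reaches re.sub(), so the group reference is silently lost rather than reported as an error.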
