
Regular Expression replacement in Python

I have a regular expression that matches every instance of 1 followed by an uppercase letter. I would like to remove all of these instances.

EXPRESSION = re.compile(r"1([A-Z])")

I can use re.split.

result = EXPRESSION.split(input)

This returns a list, so we could do

result = ''.join(EXPRESSION.split(input))

to convert it back to a string.

Or we could use

result = EXPRESSION.sub('', input)

Is there any difference in the end result?

Yes, the results are different. Here is a simple example:

import re

EXPRESSION = re.compile(r"1([A-Z])")

s = 'hello1Aworld'

result_split = ''.join(EXPRESSION.split(s))
result_sub = EXPRESSION.sub('', s)

print('split:', result_split)
print('sub:  ', result_sub)

Output:

split: helloAworld
sub:   helloworld

The reason is that, because of the capture group, EXPRESSION.split(s) includes the captured A in its result, as noted in the documentation:

re.split = split(pattern, string, maxsplit=0, flags=0)

Split the source string by the occurrences of the pattern, returning a list containing the resulting substrings. If capturing parentheses are used in pattern, then the text of all groups in the pattern are also returned as part of the resulting list. If maxsplit is nonzero, at most maxsplit splits occur, and the remainder of the string is returned as the final element of the list.
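To make the quoted behaviour concrete, here is a small sketch that prints the lists returned by split with and without the capturing group. The second sample string and the variable names are only illustrative and do not come from the question.

```python
import re

s = 'hello1Aworld'

# With a capturing group, the captured text is kept in the result list.
with_group = re.compile(r"1([A-Z])").split(s)

# Without the group, only the text between matches is returned.
without_group = re.compile(r"1[A-Z]").split(s)

# maxsplit limits the number of splits; the remainder is returned unsplit.
limited = re.compile(r"1[A-Z]").split('a1Bc1Dd', maxsplit=1)

print(with_group)     # ['hello', 'A', 'world']
print(without_group)  # ['hello', 'world']
print(limited)        # ['a', 'c1Dd']
```

Joining with_group therefore puts the A back into the string, which is why the split version prints helloAworld.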


When the capturing parentheses are removed, i.e., using

EXPRESSION = re.compile(r"1[A-Z]")

then so far I have not found a case where result_split and result_sub differ, even after reading this answer to a similar question about regular expressions in JavaScript, and after changing the replacement string from '' to '-' (that is, comparing '-'.join(EXPRESSION.split(s)) with EXPRESSION.sub('-', s)).
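A quick way to probe this is to run both approaches over a handful of inputs. The sketch below is only a spot check, not a proof; the sample strings and the helper same_result are made up for illustration.

```python
import re

EXPRESSION = re.compile(r"1[A-Z]")  # no capturing group

# Illustrative inputs: matches at the start, at the end, adjacent, none, empty.
samples = ['hello1Aworld', '1A', '1A1B', 'x1Ay1Bz', '11A', 'no match', '']


def same_result(s, repl=''):
    """Compare join-after-split with sub for one input and replacement."""
    via_split = repl.join(EXPRESSION.split(s))
    via_sub = EXPRESSION.sub(repl, s)
    return via_split == via_sub


all_equal = all(same_result(s, repl) for s in samples for repl in ('', '-'))
print(all_equal)  # True
```

One caveat that I believe holds: sub interprets backslash escapes and group references in the replacement string, while str.join treats it literally, so a replacement such as r'\n' would give different results.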
