Python: Count and replace regular expression in same pass?
I can globally replace a regular expression with re.sub(), and I can count matches with
for match in re.finditer(pattern, string): count += 1
Is there a way to combine these two, so that I can count my substitutions without making two passes through the source string?
Note: I'm not interested in whether the substitution matched; I'm interested in the exact count of matches in the same call, avoiding one call to count and one call to substitute.
You can pass a repl function when calling re.sub. The function takes a single match object argument and returns the replacement string. It is called for every non-overlapping occurrence of the pattern.
Try this:
import re

count = 0

def count_repl(mobj):  # mobj is an re.Match object
    global count
    count += 1  # count the substitutions
    return "your_replacement_string"  # return the replacement string

text = "The original text"  # source string
# Count and replace the matching occurrences in one pass.
new_text = re.sub(r"pattern", repl=count_repl, string=text)
Or, you can use re.subn, which performs the same operation as re.sub but returns a tuple (new_string, number_of_subs_made).
new_text, count = re.subn(r"pattern", repl="replacement", string=text)
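Because re.subn accepts a function for repl just as re.sub does, the two suggestions can be combined, so no global counter is needed. A minimal sketch, where the sample text, the lambda, and the variable names are illustrative only:

```python
import re

text = "Jack 10, Lana 11, Tom 12, Arthur, Mark"  # illustrative sample text

# repl may be a function; subn still reports how many substitutions were made.
new_text, count = re.subn(r"(\d+)", lambda m: f"ID: {m.group(1)}", text)

print(new_text)  # Jack ID: 10, Lana ID: 11, Tom ID: 12, Arthur, Mark
print(count)     # 3
```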
Example:
import re

count = 0

def count_repl(mobj):
    global count
    count += 1
    return f"ID: {mobj.group(1)}"

text = "Jack 10, Lana 11, Tom 12, Arthur, Mark"
new_text = re.sub(r"(\d+)", repl=count_repl, string=text)
print(new_text)
print("Number of substitutions:", count)
Output:
Jack ID: 10, Lana ID: 11, Tom ID: 12, Arthur, Mark
Number of substitutions: 3
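If the module-level global is undesirable, the counter can live on a callable object instead, since re.sub accepts any callable as repl. This is only a sketch: the class name CountingRepl is hypothetical and not part of the re module.

```python
import re

class CountingRepl:
    """Hypothetical helper: a callable repl that counts its own calls."""

    def __init__(self):
        self.count = 0

    def __call__(self, mobj):
        self.count += 1  # one call per non-overlapping match
        return f"ID: {mobj.group(1)}"

text = "Jack 10, Lana 11, Tom 12, Arthur, Mark"
repl = CountingRepl()
new_text = re.sub(r"(\d+)", repl, text)

print(new_text)
print("Number of substitutions:", repl.count)
```

Each CountingRepl instance starts from zero, so separate substitutions do not share state the way a global would.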
You can use re.subn.
re.subn(pattern, repl, string, count=0, flags=0)
It returns (new_string, number_of_subs_made).
For example purposes, I'm using the same example that @Shubham Sharma used.
import re
text = "Jack 10, Lana 11, Tom 12, Arthur, Mark"
out_str, count = re.subn(r"(\d+)", repl='replacement', string=text)
# out_str --> 'Jack replacement, Lana replacement, Tom replacement, Arthur, Mark'
# count --> 3
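One caveat about the count parameter of re.subn: it caps the number of substitutions, so the number returned is the substitutions actually made, which may be lower than the total number of matches. A small sketch using a compiled pattern (the names digits, all_text, and some_text are illustrative):

```python
import re

text = "Jack 10, Lana 11, Tom 12, Arthur, Mark"
digits = re.compile(r"\d+")

# No limit: every match is replaced.
all_text, all_count = digits.subn("N", text)

# count=2 stops after two substitutions; the returned number reflects that.
some_text, some_count = digits.subn("N", text, count=2)

print(all_text, all_count)    # Jack N, Lana N, Tom N, Arthur, Mark 3
print(some_text, some_count)  # Jack N, Lana N, Tom 12, Arthur, Mark 2
```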
import re
text = "Jack 10, Lana 11, Tom 12"
count = len([x for x in re.finditer(r"(\d+)", text)])
print(count)
# Output: 3
import re
text = "Jack 10, Lana 11, Tom 12"
count = re.subn(r"(\d+)", repl="replacement", string=text)[1]
print(count)
# Output: 3