
Python: Count and replace regular expression in same pass?

I can globally replace a regular expression with re.sub(), and I can count matches with

for match in re.finditer(pattern, text):
    count += 1

Is there a way to combine these two, so that I can count my substitutions without making two passes through the source string?

Note: I'm not interested in whether the substitution matched. I'm interested in the exact count of matches from the same call, rather than making one call to count and another to substitute.

You can pass a repl function when calling re.sub. The function takes a single match object argument and returns the replacement string. It is called for every non-overlapping occurrence of the pattern.

Try this:

import re

count = 0

def count_repl(mobj):  # --> mobj is of type re.Match
    global count
    count += 1  # --> count the substitutions
    return "your_replacement_string"  # --> return the replacement string

text = "The original text"  # --> source string
new_text = re.sub(r"pattern", repl=count_repl, string=text)  # count and replace the matching occurrences in one pass
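If the module-level global is undesirable, the counter can live in a small callable object instead. This is only a minimal sketch: CountingRepl is a hypothetical helper name, not part of the re module, and the pattern and sample text are made up for illustration.

```python
import re


class CountingRepl:
    """Hypothetical helper: a replacement callable that counts its calls."""

    def __init__(self, replacement):
        self.replacement = replacement
        self.count = 0

    def __call__(self, mobj):
        # re.sub calls this once per non-overlapping match
        self.count += 1
        return self.replacement


repl = CountingRepl("X")
new_text = re.sub(r"\d+", repl, "a1 b22 c333")

print(new_text)    # aX bX cX
print(repl.count)  # 3
```

Each CountingRepl instance keeps its own count, so several substitutions can be tracked separately without resetting a shared variable.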

Or,

You can use re.subn, which performs the same operation as re.sub but returns a tuple (new_string, number_of_subs_made).

new_text, count = re.subn(r"pattern", repl="replacement", string=text)

Example:

import re

count = 0
def count_repl(mobj):
    global count
    count += 1
    return f"ID: {mobj.group(1)}"

text = "Jack 10, Lana 11, Tom 12, Arthur, Mark"
new_text = re.sub(r"(\d+)", repl=count_repl, string=text)

print(new_text)
print("Number of substitutions:", count)

Output:

Jack ID: 10, Lana ID: 11, Tom ID: 12, Arthur, Mark
Number of substitutions: 3
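re.subn also accepts a function as repl, so the per-match replacement from this example can be kept while the count comes from the return value rather than a global. A minimal sketch on the same sample text; the name to_id is only illustrative.

```python
import re


def to_id(mobj):
    # Build the replacement from the captured digits
    return f"ID: {mobj.group(1)}"


text = "Jack 10, Lana 11, Tom 12, Arthur, Mark"
new_text, count = re.subn(r"(\d+)", to_id, text)

print(new_text)  # Jack ID: 10, Lana ID: 11, Tom ID: 12, Arthur, Mark
print("Number of substitutions:", count)  # Number of substitutions: 3
```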

You can use re.subn.

re.subn(pattern, repl, string, count=0, flags=0)

It returns (new_string, number_of_subs_made).

For example purposes, I'm using the same example that @Shubham Sharma used.

import re

text = "Jack 10, Lana 11, Tom 12, Arthur, Mark"
out_str, count = re.subn(r"(\d+)", repl="replacement", string=text)

# out_str --> 'Jack replacement, Lana replacement, Tom replacement, Arthur, Mark'
# count --> 3
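
The count parameter of re.subn caps how many substitutions are made, and the returned number then reflects only the substitutions actually performed, not every possible match. A small sketch on the same text; the variable names and the replacement "N" are illustrative.

```python
import re

text = "Jack 10, Lana 11, Tom 12, Arthur, Mark"

# count=0 (the default) replaces every match
all_text, all_n = re.subn(r"\d+", "N", text)
# count=2 stops after the first two matches
some_text, some_n = re.subn(r"\d+", "N", text, count=2)

print(all_n, all_text)    # 3 Jack N, Lana N, Tom N, Arthur, Mark
print(some_n, some_text)  # 2 Jack N, Lana N, Tom 12, Arthur, Mark
```

So when a limit is passed, the second tuple element is the number of replacements made, which may be lower than the number of matches in the string.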
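
The matches can also be counted on their own with re.finditer:

import re

text = "Jack 10, Lana 11, Tom 12"
count = sum(1 for _ in re.finditer(r"(\d+)", text))  # count matches without building a list
print(count)

# Output: 3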

OK, there's a better way:

import re


text = "Jack 10, Lana 11, Tom 12"
count = re.subn(r"(\d+)", repl="replacement", string=text)[1]
print(count)

# Output: 3
