What is the fastest way to filter a set of substrings out of a string in Python?

Question

I'm looking at a statement that looks like this:

def fn(somelongstring):
    shorterstring = somelongstring.replace('very, ', '').replace('long ', '')
    return shorterstring

fn('some very, very, very, long string')

What's the most efficient method for performing this kind of operation in Python?

Some notes:

The list of replace calls is quite long, but fixed and known in advance.
The long string is an argument to the function and can get massive; it includes repetitions of the substrings.
My intuition is that deletion has the opportunity to use different, faster algorithms than replacement.
The chained replace calls are probably each iterating over the string. There has to be a way to do this without all those repeated iterations.

Answer 1

Use a regular expression (the re module):

import re
shorterstring = re.sub('very, |long ', '', 'some very, very, very, long string')

Make sure the substrings to delete are listed in descending order of length, so that longer matches are tried first. Regex alternation picks the first alternative that matches at a given position, so if one substring is a prefix of another, the shorter one would otherwise win.
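As a minimal sketch of how the pattern could be built from a fixed list, assuming the substrings are plain literals rather than regex patterns: the names SUBSTRINGS, PATTERN, and strip_substrings below are hypothetical, and the extra 'long' entry is added only to illustrate the ordering issue. re.escape keeps any regex metacharacters in the substrings literal, and compiling once avoids rebuilding the pattern on every call.

```python
import re

# Hypothetical fixed list of substrings to delete (known in advance).
SUBSTRINGS = ['very, ', 'long ', 'long']

# Longest first, so 'long ' is tried before its prefix 'long'.
# re.escape makes each substring match literally.
PATTERN = re.compile(
    '|'.join(re.escape(s) for s in sorted(SUBSTRINGS, key=len, reverse=True))
)


def strip_substrings(text):
    """Delete every occurrence of the listed substrings in one pass."""
    return PATTERN.sub('', text)


print(strip_substrings('some very, very, very, long string'))  # some string
```

Whether this beats the chained replace calls depends on the input and the number of substrings, so it is worth measuring on representative data.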

Or, to avoid writing out the chained calls by hand (the string is still scanned once per substring), you could use:

from functools import reduce  # required in Python 3; reduce is a builtin in Python 2
reduce(lambda a, b: a.replace(b, ''), ['very, ', 'long '], s)
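Note that the two approaches are not always equivalent. Each replace call rescans the result of the previous one, so a deletion can create a new match, whereas a single regex pass only sees matches present in the original text. A small sketch with made-up substrings ('bc' and 'ad' are hypothetical values chosen to show the difference):

```python
import re
from functools import reduce

substrings = ['bc', 'ad']  # hypothetical example values
text = 'abcd'

# Chained replace: removing 'bc' leaves 'ad', which the next pass removes.
chained = reduce(lambda acc, sub: acc.replace(sub, ''), substrings, text)

# Single regex pass: only 'bc' occurs in the original text, so 'ad' survives.
single_pass = re.sub('|'.join(map(re.escape, substrings)), '', text)

print(repr(chained))      # ''
print(repr(single_pass))  # 'ad'
```

If the substrings can never combine like this, both approaches give the same result and the choice comes down to measured speed.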

Answer 1
3 2013-09-13 14:40:55
