What is the fastest way to filter a set of substrings from a string in Python?

I'm looking at a statement that looks like this:

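def fn(somelongstring):
    # Delete each unwanted substring by replacing it with the empty string.
    shorterstring = somelongstring.replace('very, ', '').replace('long ', '')
    return shorterstring

fn('some very, very, very, long string')  # 'some string'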

What's the most efficient method for performing this kind of operation in Python?


Some notes:

  • The list of replace calls is quite long, but it is fixed and known in advance.
  • The long string is an argument to the function and can get massive; it contains repetitions of the substrings.
  • My intuition is that deletion could use a different, faster algorithm than general replacement.
  • Each of the chained replace calls is probably iterating over the whole string. There has to be a way to do this without all those repeated iterations.

Use a regular expression (the re module):

import re
shorterstring = re.sub('very, |long ', '', 'some very, very, very, long string')

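You'll need to make sure that the substrings to remove are listed in descending order of length, so that longer matches are tried first. The regex engine tries the alternatives left to right and takes the first one that matches, so the order matters whenever one substring is a prefix of another. If any substring contains regex metacharacters (such as . or *), escape it with re.escape before joining.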
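The ordering rule can be applied automatically instead of by hand. The following is a minimal sketch, not the original poster's code: the helper names build_pattern and strip_substrings and the sample lists are hypothetical, and it assumes the substrings are plain literals rather than patterns.

```python
import re

def build_pattern(substrings):
    """Compile one alternation that matches any of the literal substrings."""
    # Longest first, so a longer substring wins over a shorter one that is
    # its prefix; re.escape makes every character match literally.
    ordered = sorted(substrings, key=len, reverse=True)
    return re.compile('|'.join(re.escape(s) for s in ordered))

def strip_substrings(text, pattern):
    """Remove every match of the compiled pattern in a single pass."""
    return pattern.sub('', text)

# Hypothetical fixed list, compiled once and reused for every call.
PATTERN = build_pattern(['very, ', 'long '])
print(strip_substrings('some very, very, very, long string', PATTERN))

# Why the order matters: 'ab' is a prefix of 'abc'.
print(re.sub('ab|abc', '', 'xabcx'))                         # shorter alternative listed first
print(strip_substrings('xabcx', build_pattern(['ab', 'abc'])))  # longest tried first
```

Because the list is fixed and known in advance, compiling the pattern once at module level avoids rebuilding it on every call.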

Or you could avoid writing out the chained calls and use reduce (this still makes one pass over the string per substring):

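from functools import reduce  # reduce is a builtin only in Python 2
shorterstring = reduce(lambda a, b: a.replace(b, ''), ['very, ', 'long '], s)  # s is the input string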
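Which approach is faster depends on the input, so it is worth measuring rather than guessing: str.replace is implemented in C and can be hard to beat even when it runs several times. The sketch below is one way to compare the two with timeit. The function names, the two-item list, and the sample text are assumptions for illustration, and no timings are claimed here.

```python
import re
import timeit
from functools import reduce

SUBSTRINGS = ['very, ', 'long ']  # stand-in for the real, longer list
PATTERN = re.compile(
    '|'.join(re.escape(s) for s in sorted(SUBSTRINGS, key=len, reverse=True))
)

def chained_remove(text):
    # One full pass over the string per substring.
    return reduce(lambda acc, sub: acc.replace(sub, ''), SUBSTRINGS, text)

def regex_remove(text):
    # A single left-to-right pass over the string.
    return PATTERN.sub('', text)

def time_both(text, number=20):
    """Return the total seconds taken by each approach over `number` runs."""
    return {
        'chained': timeit.timeit(lambda: chained_remove(text), number=number),
        'regex': timeit.timeit(lambda: regex_remove(text), number=number),
    }

if __name__ == '__main__':
    sample = 'some very, very, very, long string ' * 1000
    print(time_both(sample))
    # The two approaches are not always equivalent: a later replace call can
    # match text that only exists because an earlier call deleted something.
    print(chained_remove('lovery, ng string'))  # 'string'
    print(regex_remove('lovery, ng string'))    # 'long string'
```

Note the behavioural difference shown at the end: the regex version never rescans text it has already passed, so it will not remove a substring that is only formed after a deletion.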
