How to remove a substring from a list of strings?

I have a list of strings that all share a common format: "pp:actual_string". I do not know in advance what the prefix before the : will be; the : simply acts as a delimiter, and everything before it should be left out of the result.

I have solved the problem with a brute-force approach, but I would like to see a cleverer method, perhaps something using regex.

Note: Some strings might not have this "pp:string" format and could already be clean, i.e. without the delimiter.

This is my current solution:

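ll = ["pp17:gaurav", "pp17:sauarv", "pp17:there", "pp17:someone"]
res = []
for i in ll:
    index = 0  # default: keep the whole string when there is no ':'
    for j in range(len(i)):
        if i[j] == ':':
            index = j + 1  # start just after the most recent ':'
    res.append(i[index:])

print(res)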

Is there a way to do this without creating an extra list?

Here are a few options, based on different assumptions.

Most explicit:

if s.startswith('pp:'):
    s = s[len('pp:'):]  # aka 3
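
As a minimal sketch, the same check can be applied to a whole list. The helper name strip_known_prefix and its prefix parameter are made up for illustration and assume the prefix is known up front; on Python 3.9+ the built-in str.removeprefix does the same job for a single string.

```python
def strip_known_prefix(strings, prefix="pp:"):
    """Return a new list with `prefix` removed from the strings that start with it."""
    # Strings that do not start with the prefix are passed through unchanged.
    return [s[len(prefix):] if s.startswith(prefix) else s for s in strings]


sample = ["pp:actual_string", "perfect_string", "frog:actual_string"]
stripped = strip_known_prefix(sample)
print(stripped)
```

Note that "frog:actual_string" is left untouched here, because only the exact prefix "pp:" is removed.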

If you want to remove everything up to and including the first ":":

s = s.split(':', 1)[-1]
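
The maxsplit argument of 1 matters when a string contains more than one colon. The following sketch, using made-up sample strings, compares splitting once with splitting on every colon:

```python
samples = ["pp17:gaurav", "perfect_string", "a:b:c"]

# Split at most once: keep everything after the FIRST ':'.
after_first = [s.split(':', 1)[-1] for s in samples]

# Split on every ':': keep only what follows the LAST ':'.
after_last = [s.split(':')[-1] for s in samples]

print(after_first)
print(after_last)
```

Both versions leave "perfect_string" unchanged, since a split with no delimiter present returns a one-element list.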

Regular expressions:

Same as startswith:

import re
s = re.sub(r'^pp:', '', s)

Similar to split, but it only strips a literal 'pp:' prefix, and it is slower:

s = re.match('(?:^pp:)?(.*)', s).group(1)
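
Since the question says the prefix is not known in advance, a more general pattern may be closer to what is needed. This is only a sketch: the names ANY_PREFIX and strip_any_prefix are invented here, and the pattern assumes that everything up to the first colon counts as the prefix.

```python
import re

# '^[^:]*:' matches any run of non-colon characters at the start of the
# string, followed by the first ':'.
ANY_PREFIX = re.compile(r'^[^:]*:')


def strip_any_prefix(s):
    """Remove everything up to and including the first ':', if there is one."""
    return ANY_PREFIX.sub('', s)


cleaned = [strip_any_prefix(s) for s in ["pp17:gaurav", "perfect_string", "a:b:c"]]
print(cleaned)
```

Strings without a colon do not match the pattern, so they come back unchanged.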

Whilst regex is an incredibly powerful tool with a lot of capabilities, using a "clever method" is not necessarily the best idea if you are unfamiliar with its principles.

Your problem can be solved without regex: split on the : character with the str.split() method, then take the last part with the [-1] index, which is the last (or only) string that results from the split. This works even if there is no : in the string.

list_with_prefixes = ["pp:actual_string", "perfect_string", "frog:actual_string"]

cleaned_list = [x.split(':')[-1] for x in list_with_prefixes]
print(cleaned_list)

This list comprehension takes each string in turn (x) and splits it on the : character, which returns a list containing the prefix (if there is one) and the suffix. It then builds a new list containing only the suffix (i.e. item [-1] of the list that results from the split). In this example, it returns:

['actual_string', 'perfect_string', 'actual_string']
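
The question also asks whether this can be done without creating an extra list. One possible sketch is to overwrite each element of the existing list by index; the function name strip_prefixes_in_place is hypothetical.

```python
def strip_prefixes_in_place(strings):
    """Replace each element with the part after its last ':' (mutates `strings`)."""
    for i, s in enumerate(strings):
        # Assigning by index updates the existing list object,
        # so no second list is built.
        strings[i] = s.split(':')[-1]


list_with_prefixes = ["pp:actual_string", "perfect_string", "frog:actual_string"]
same_list = list_with_prefixes  # second reference to the same list object
strip_prefixes_in_place(list_with_prefixes)
print(list_with_prefixes)
```

Because the list is modified in place, every other reference to it (such as same_list above) sees the cleaned values too.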
