Delete substrings from a list of strings
I have a list:
l = ['abc', 'abcdef', 'def', 'defdef', 'polopolo']
I'm trying to delete strings whose superstring is already in the list. In this case, the result should be:
['abcdef', 'defdef', 'polopolo']
I have written this code:
l=['abc','abcdef','def','defdef','polopolo']
res=['abc','abcdef','def','defdef','polopolo']
for each in l:
    l1=[x for x in l if x!=each]
    for other in l1:
        if each in other:
            res.remove(each)
but it doesn't seem to work. I have read that we cannot remove items from a list while iterating over it, hence the copy res, while l is my original list.
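As far as I can tell, the loop in the question does not fail because of the iteration itself but with a ValueError: 'def' is contained in both 'abcdef' and 'defdef', so res.remove('def') is called twice and the second call finds nothing left to remove. A minimal sketch of a fix that keeps the same nested-loop idea is to stop after the first match. The function name remove_substrings is made up for this example.

```python
def remove_substrings(strings):
    """Return the strings that are not contained in any other, different string."""
    result = list(strings)  # work on a copy so the input list stays untouched
    for each in strings:
        for other in strings:
            if other != each and each in other:
                result.remove(each)
                break  # one removal per visit; a second remove() would raise ValueError
    return result


l = ['abc', 'abcdef', 'def', 'defdef', 'polopolo']
print(remove_substrings(l))
# ['abcdef', 'defdef', 'polopolo']
```

The break is the only functional change compared with the original loop; the other != each test plays the role of the l1 copy.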
l=['abc','abcdef','def','defdef','polopolo']
print([j for i, j in enumerate(l) if all(j not in k for k in l[i + 1:])])
# ['abcdef', 'defdef', 'polopolo']
We can speed it up very slightly by sorting the list by length first:
l = sorted(l, key = len)
print([j for i, j in enumerate(l) if all(j not in k for k in l[i + 1:])])
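Sorting by length appears to do more than save a little time: once the list is ordered by length, any proper superstring of a string must come after it, so checking only l[i + 1:] no longer depends on how the input was ordered. The sketch below illustrates this with a two-element list in which the superstring comes first. The helper name keep_maximal is hypothetical, and the sample list is chosen only for the demonstration.

```python
def keep_maximal(strings):
    """Drop every string that occurs inside a later string of the length-sorted list."""
    ordered = sorted(strings, key=len)
    return [s for i, s in enumerate(ordered)
            if all(s not in t for t in ordered[i + 1:])]


# Superstring listed first: the unsorted comprehension never compares 'abc' with 'abcdef'.
l = ['abcdef', 'abc']
unsorted_result = [j for i, j in enumerate(l) if all(j not in k for k in l[i + 1:])]
print(unsorted_result)  # ['abcdef', 'abc']
print(keep_maximal(l))  # ['abcdef']
```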
As @Ashwini Chaudhary mentions in the comments, if you want to retain duplicate strings, you can do this:
l = ['abc', 'defghi', 'abcdef', 'def', 'defdef', 'defdef', 'polopolo']
l = sorted(l, key = len)
print([j for i,j in enumerate(l) if all(j == k or (j not in k) for k in l[i+1:])])
# ['defghi', 'abcdef', 'defdef', 'defdef', 'polopolo']
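Sorting by length also reorders the surviving strings. If the original order matters, one possible alternative (my own sketch, not part of the answer above; the function name is hypothetical) is to compare each string against every other string instead of only the later ones. It keeps duplicates for the same reason as the j == k test: equal strings are not treated as substrings of each other.

```python
def remove_substrings_keep_order(strings):
    """Keep, in input order, the strings not contained in any different string."""
    return [s for s in strings
            if not any(s != t and s in t for t in strings)]


# Sample list chosen for this sketch: 'polopolo' comes first and 'abc' appears twice.
l = ['polopolo', 'abc', 'abcdef', 'abc', 'def', 'defdef', 'defdef']
print(remove_substrings_keep_order(l))
# ['polopolo', 'abcdef', 'defdef', 'defdef']
```

Like the comprehensions above, this still makes a quadratic number of substring checks.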