Two lists of strings: remove strings from list A that contain any string from list B?
I have two lists of strings.

    filters = ['foo', 'bar']
    wordlist = ['hey', 'badge', 'foot', 'bar', 'cone']

I want to remove every word in the wordlist that contains a filter.

    def filter_wordlist(filters, wordlist):
        for word in wordlist:
            if word contains any string from filters, remove it from the wordlist
        return wordlist

So this filter function would return ['hey', 'badge', 'cone']. It removed bar because bar is in filters. It removed foot because it contains the string foo.
I tried this:

    for word in wordlist:
        for f in filters:
            if f in word:
                wordlist.remove(word)

But it consistently raises ValueError: list.remove(x): x not in list. So I tried wrapping it in a series of increasingly frustrating try/except blocks, but nothing down that gopher hole worked. I added a break statement below the remove command, but that was ... spotty. It seems like the items towards the end of the wordlist aren't getting filtered properly.
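One plausible cause of the ValueError, assuming the real data contains a word that matches more than one filter: the inner loop removes the word on the first match, then tries to remove it again on the second. The word 'foobar' below is a hypothetical example, not taken from the original lists.

```python
filters = ['foo', 'bar']
wordlist = ['hey', 'foobar', 'cone']  # 'foobar' is a made-up word that matches both filters

error = None
try:
    for word in wordlist:
        for f in filters:
            if f in word:
                # First match removes the word; a second match tries to
                # remove it again, and list.remove() raises ValueError.
                wordlist.remove(word)
except ValueError as exc:
    error = str(exc)

print(error)
print(wordlist)
```

Adding a break after the remove call avoids the double removal, which would explain why the error went away while the filtering stayed unreliable.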
So I changed tactics to this:

    for f in filters:
        for word in wordlist:
            if f in word:
                wordlist.remove(word)

This is spotty just like before.
So I tried this:

    for word in wordlist:
        if any(f in word for f in filters):
            wordlist.remove(word)

And now it's definitely irritating me. Spotty. And by now, I've realized what's happening: using remove() is changing the list as I'm iterating over it, and that's screwing up the iteration.
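The skipping can be made visible with a small sketch that records which words the loop actually visits. The visited list is an illustrative helper, not part of the original code.

```python
filters = ['foo', 'bar']
wordlist = ['hey', 'badge', 'foot', 'bar', 'cone']

visited = []  # illustrative helper: records each word the loop actually sees
for word in wordlist:
    visited.append(word)
    if any(f in word for f in filters):
        # Removing shifts every later item one position to the left,
        # so the loop's next index steps past the item that moved up.
        wordlist.remove(word)

print(visited)   # ['hey', 'badge', 'foot', 'cone']
print(wordlist)  # ['hey', 'badge', 'bar', 'cone']
```

After 'foot' is removed, 'bar' slides into the position the loop has just finished with, so it is never examined and survives the filter.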
This seems like it should be really simple. I have two lists of strings. Take all of the items in List A. If any of those items contain any item from List B, remove the item from List A.
This is the working solution I finally got:

    keepitup = True
    while keepitup:
        start_length = len(wordlist)
        for word in wordlist:
            if any(f in word for f in filters):
                wordlist.remove(word)
        end_length = len(wordlist)
        if start_length != end_length:
            keepitup = True
        else:
            keepitup = False

This seems ridiculous. Surely there's a better way?
You could use a list comprehension:

    wordlist = [word for word in wordlist if all(f not in word for f in filters)]

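As a minimal sketch, the comprehension can be dropped into the filter_wordlist function from the question. It uses not any(...), which is equivalent to the all(... not in ...) form, and it returns a new list instead of modifying its argument.

```python
def filter_wordlist(filters, wordlist):
    """Return a new list holding only the words that contain none of the filters."""
    return [word for word in wordlist if not any(f in word for f in filters)]


filters = ['foo', 'bar']
wordlist = ['hey', 'badge', 'foot', 'bar', 'cone']

result = filter_wordlist(filters, wordlist)
print(result)  # ['hey', 'badge', 'cone']
```

Because a new list is built, nothing is removed from a list while it is being iterated, so no words are skipped.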
Or the filter function (in Python 3 it returns an iterator, so wrap it in list() to get a list back):

    wordlist = list(filter(lambda word: all(f not in word for f in filters), wordlist))

Or you could iterate over a copy of wordlist:

    for word in wordlist[:]:
        if any(f in word for f in filters):
            wordlist.remove(word)
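One practical difference between these options is whether the original list object is modified. The sketch below compares the two; alias and other are illustrative names for a second reference to the same list.

```python
filters = ['foo', 'bar']

# Iterating over a copy removes items from the original list object,
# so every other reference to that list sees the change.
wordlist = ['hey', 'badge', 'foot', 'bar', 'cone']
alias = wordlist
for word in wordlist[:]:
    if any(f in word for f in filters):
        wordlist.remove(word)
print(alias)  # ['hey', 'badge', 'cone']

# Assigning the comprehension result rebinds the name to a new list;
# other references still point at the unfiltered original.
wordlist = ['hey', 'badge', 'foot', 'bar', 'cone']
other = wordlist
wordlist = [w for w in wordlist if not any(f in w for f in filters)]
print(other)     # ['hey', 'badge', 'foot', 'bar', 'cone']
print(wordlist)  # ['hey', 'badge', 'cone']
```

If other code holds a reference to the list and should see the filtered result, the in-place loop (or a slice assignment such as wordlist[:] = ...) is the one to reach for; otherwise the comprehension is usually simpler.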