
Two Lists of strings: remove strings from list A that contain any string from list B?

I have two lists of strings.

filters = ['foo', 'bar']

wordlist = ['hey', 'badge', 'foot', 'bar', 'cone']

I want to remove every word in the wordlist that contains a filter.

def filter_wordlist(filters, wordlist):

    for word in wordlist:
        # pseudocode: if word contains any string from filters,
        # remove it from the wordlist
        pass

    return wordlist

So this filter function would return ['hey', 'badge', 'cone']. It removed bar because bar is in filters. It removed foot because it contains the string foo.

I tried this:

for word in wordlist:
    for f in filters:
        if f in word:
            wordlist.remove(word)

But it consistently raises ValueError: list.remove(x): x not in list. So I tried wrapping it in a series of increasingly frustrating try/except blocks, but nothing down that gopher hole worked. I added a break statement below the remove call, but that was ... spotty. It seems like the items towards the end of the wordlist aren't getting filtered properly.

So I changed tactics to this:

for f in filters:
    for word in wordlist:
        if f in word:
            wordlist.remove(word)

This is spotty just like before.

So I tried this:

for word in wordlist:
    if any(f in word for f in filters):
        wordlist.remove(word)

And now it's definitely irritating me. Spotty. And by now, I've realized what's happening - using remove() is changing the list as I'm iterating over it, and that's screwing up the iteration.
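To make the skipping concrete, here is a small sketch using the lists from above. The `seen` list is not part of the original attempt; it is added only to record which words the loop actually visits.

```python
filters = ['foo', 'bar']
wordlist = ['hey', 'badge', 'foot', 'bar', 'cone']

seen = []  # illustration only: words the loop actually visits
for word in wordlist:
    seen.append(word)
    if any(f in word for f in filters):
        # Removing shifts every later item one slot to the left,
        # so the item right after the removed one is never visited.
        wordlist.remove(word)

print(seen)      # ['hey', 'badge', 'foot', 'cone']
print(wordlist)  # ['hey', 'badge', 'bar', 'cone']
```

'bar' sits directly after 'foot', so it slides into the slot the loop has already passed and survives the filtering.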

This seems like it should be really simple. I have two lists of strings. Take all of the items in List A. If any of those items contain any item from List B, remove the item from List A.

This is the working solution I finally got:

keepitup = True

while keepitup:
    start_length = len(wordlist)
    for word in wordlist:
        if any(f in word for f in filters):
            wordlist.remove(word)
    end_length = len(wordlist)
    if start_length != end_length:
        keepitup = True
    else:
        keepitup = False

This seems ridiculous. Surely there's a better way?

You could use a list comprehension:

wordlist = [word for word in wordlist if all(f not in word for f in filters)]
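As a rough sketch, the comprehension could be dropped into the `filter_wordlist` function from the question like this. It uses `not any(...)`, which is equivalent to the `all(f not in word ...)` form, and it returns a new list rather than modifying the one passed in.

```python
def filter_wordlist(filters, wordlist):
    """Return a new list without the words that contain any filter string."""
    return [word for word in wordlist if not any(f in word for f in filters)]


filters = ['foo', 'bar']
wordlist = ['hey', 'badge', 'foot', 'bar', 'cone']

print(filter_wordlist(filters, wordlist))  # ['hey', 'badge', 'cone']
```

Because nothing is removed while iterating, no items get skipped.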

Or the filter function (in Python 3 it returns an iterator, so wrap it in list() and assign the result):

wordlist = list(filter(lambda word: all(f not in word for f in filters), wordlist))

Or you could iterate over a copy of wordlist:

for word in wordlist[:]:
    if any(f in word for f in filters):
        wordlist.remove(word)
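If the list has to be changed in place (say, because other code holds a reference to the same list object), one more option is slice assignment. This is a sketch; the `alias` name is hypothetical and only there to show that the same object is updated.

```python
filters = ['foo', 'bar']
wordlist = ['hey', 'badge', 'foot', 'bar', 'cone']
alias = wordlist  # hypothetical second reference to the same list object

# The comprehension builds the filtered list first; slice assignment then
# replaces the contents of the existing list instead of rebinding the name.
wordlist[:] = [word for word in wordlist if not any(f in word for f in filters)]

print(alias)  # ['hey', 'badge', 'cone']
```

This avoids the repeated remove() calls, each of which has to search the list again.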
