
Remove multiple string elements from a list which contains certain keywords

I have a list called files that consists of string elements:

files = ['Upside Your Head.txt', 'The Mighty Quinn - [Remastered].txt', 'The Mighty Quinn - (live).txt', 'Fixin To Die (Mono version).txt', '10,000 Men - [Remastered].txt', '10,000 Men.txt', '10.000 Men - (live).txt']

I am trying to remove elements that contain specific keywords (e.g. Live, Remastered, Mono). The code I wrote looks like this:

files = [i for i in files if '(Live)' not in i]
files = [i for i in files if '[Remastered]' not in i]
files = [i for i in files if '(Mono' not in i]

What is a better way to combine the three lines above into one statement, considering that I want to add more keywords later?

You can try this with a list comprehension for readability:

files = ['Upside Your Head.txt', 
         'The Mighty Quinn - [Remastered].txt', 
         'The Mighty Quinn - (live).txt', 
         'Fixin To Die (Mono version).txt', 
         '10,000 Men - [Remastered].txt', 
         '10,000 Men.txt', 
         '10.000 Men - (live).txt']

rem = ['(live)','[Remastered]','(Mono']

[f for f in files if all(r not in f for r in rem)]
['Upside Your Head.txt', '10,000 Men.txt']

This is equivalent to:

[f for f in files if not any(r in f for r in rem)]
['Upside Your Head.txt', '10,000 Men.txt']

The first step to generalize this is to combine multiple conditions with logical operators:

files = [i for i in files if '(Live)' not in i]
files = [i for i in files if '[Remastered]' not in i]
files = [i for i in files if '(Mono' not in i]

becomes

files = [
    i for i in files
    if (
        '(Live)' not in i
        and '[Remastered]' not in i
        and '(Mono' not in i
    )
]

or, by means of De Morgan's laws,

files = [
    i for i in files
    if not (
        '(Live)' in i
        or '[Remastered]' in i
        or '(Mono' in i
    )
]

Now, in order to be able to take the keywords from a predefined list and automatically adjust the number of conditions accordingly, we can use the built-in all or any functions:

  • a and b and c can be replaced by all([a, b, c])
  • a or b or c can be replaced by any([a, b, c])

(See: How to apply a logical operator to all elements in a python list)
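The two equivalences above can be checked with a short sketch. The boolean values are arbitrary and chosen only for illustration; the names empty_all and empty_any are not from the original answer.

```python
from itertools import product

# Check every combination of three booleans (illustrative values only).
for a, b, c in product([True, False], repeat=3):
    assert (a and b and c) == all([a, b, c])
    assert (a or b or c) == any([a, b, c])

# Edge case: with no conditions at all, all() is True and any() is False.
empty_all = all([])
empty_any = any([])
print(empty_all, empty_any)  # True False
```

The edge case matters here: with an empty keyword list, no file would be filtered out.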

Instead of a list, we can also pass a generator expression to all and any.
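One practical difference is that a generator expression lets any and all stop at the first decisive result, while a list is built in full first. A minimal sketch of that behaviour follows; the contains helper and the log lists are hypothetical and exist only to record which keywords were actually tested.

```python
def contains(keyword, name, log):
    # Hypothetical helper: records each keyword that is actually tested.
    log.append(keyword)
    return keyword in name

keywords = ['(Live)', '[Remastered]', '(Mono']
name = 'The Mighty Quinn - [Remastered].txt'

# Generator expression: any() stops as soon as one test is True.
checked_gen = []
hit_gen = any(contains(k, name, checked_gen) for k in keywords)

# List comprehension: every test runs before any() sees the list.
checked_list = []
hit_list = any([contains(k, name, checked_list) for k in keywords])

print(hit_gen, checked_gen)    # True ['(Live)', '[Remastered]']
print(hit_list, checked_list)  # True ['(Live)', '[Remastered]', '(Mono']
```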

Therefore, the condition

if (
    '(Live)' not in i
    and '[Remastered]' not in i
    and '(Mono' not in i
)

can be written as

if all(keyword not in i for keyword in ['(Live)', '[Remastered]', '(Mono'])

and

if not (
    '(Live)' in i
    or '[Remastered]' in i
    or '(Mono' in i
)

as

if not any(keyword in i for keyword in ['(Live)', '[Remastered]', '(Mono'])

As a result, the code can become

keywords = ['(Live)', '[Remastered]', '(Mono']
files = [i for i in files if all(keyword not in i for keyword in keywords)]

or

keywords = ['(Live)', '[Remastered]', '(Mono']
files = [i for i in files if not any(keyword in i for keyword in keywords)]
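Run against the sample list from the question, the statement behaves as sketched below. Note that the in operator is case-sensitive, so '(Live)' does not match '(live)' in the sample data. The second, case-insensitive variant is an assumption about what the asker intends, not part of the original answer; the names kept, lowered and kept_ci are illustrative.

```python
files = ['Upside Your Head.txt',
         'The Mighty Quinn - [Remastered].txt',
         'The Mighty Quinn - (live).txt',
         'Fixin To Die (Mono version).txt',
         '10,000 Men - [Remastered].txt',
         '10,000 Men.txt',
         '10.000 Men - (live).txt']
keywords = ['(Live)', '[Remastered]', '(Mono']

# Case-sensitive, as in the code above: '(Live)' does not match '(live)'.
kept = [f for f in files if not any(k in f for k in keywords)]
print(kept)
# ['Upside Your Head.txt', 'The Mighty Quinn - (live).txt',
#  '10,000 Men.txt', '10.000 Men - (live).txt']

# Case-insensitive variant (assumed intent): compare case-folded strings.
lowered = [k.casefold() for k in keywords]
kept_ci = [f for f in files if not any(k in f.casefold() for k in lowered)]
print(kept_ci)  # ['Upside Your Head.txt', '10,000 Men.txt']
```

Folding the keywords once, outside the comprehension, avoids repeating that work for every file.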

Something like this, with a helper function that decides whether to keep each entry:

files = ['Upside Your Head.txt', 'The Mighty Quinn - [Remastered].txt', 'The Mighty Quinn - (live).txt', 'Fixin To Die (Mono version).txt', '10,000 Men - [Remastered].txt', '10,000 Men.txt', '10.000 Men - (live).txt']
words = {'Remastered','Live'}

def need_to_collect(entry):
    for word in words:
        if word in entry:
            return False
    return True
    
clean_list = [x for x in files if need_to_collect(x)]
print(clean_list)

Output (matching is case-sensitive, so the '(live)' entries are kept, and 'Mono' is not in words):

['Upside Your Head.txt', 'The Mighty Quinn - (live).txt', 'Fixin To Die (Mono version).txt', '10,000 Men.txt', '10.000 Men - (live).txt']

You can specify the keywords in a list and then iterate over that list, filtering files once per keyword:

keywords = ['(Live)','[Remastered]','(Mono']

for keyword in keywords:
    files = [i for i in files if keyword not in i]
print(files)
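The loop rebuilds the list once per keyword, but it should produce the same result as a single comprehension using all. A small sketch comparing the two on the sample data from the question; the names looped and single_pass are illustrative.

```python
files = ['Upside Your Head.txt',
         'The Mighty Quinn - [Remastered].txt',
         'The Mighty Quinn - (live).txt',
         'Fixin To Die (Mono version).txt',
         '10,000 Men - [Remastered].txt',
         '10,000 Men.txt',
         '10.000 Men - (live).txt']
keywords = ['(Live)', '[Remastered]', '(Mono']

# Loop version: one filtering pass (and one new list) per keyword.
looped = list(files)
for keyword in keywords:
    looped = [f for f in looped if keyword not in f]

# Single-pass version: each file is checked against every keyword once.
single_pass = [f for f in files if all(k not in f for k in keywords)]

print(looped == single_pass)  # True
```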


 