removing words from a list using list comprehension

I'm trying to remove unnecessary words (an, a, the) from a list:

Test = ['a', 'an', 'the', 'love']
unWantedWords = ['a', 'an', 'the']
RD1 = [x for x in Test if x != unWantedWords]
print(RD1)
output ->['a', 'an', 'the', 'love']

What is wrong with this?

The problem is you are comparing the value x to the entire list unWantedWords.

RD1 = [x for x in Test if x != unWantedWords]

Replace != with not in, so that each x is tested for membership in unWantedWords instead of being compared with the whole list:

RD1 = [x for x in Test if x not in unWantedWords]
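As a quick check, here is the corrected comprehension run on the data from the question. This is only a minimal sketch using the question's own variable names.

```python
# Same data as in the question.
Test = ['a', 'an', 'the', 'love']
unWantedWords = ['a', 'an', 'the']

# Keep each word that does not appear in unWantedWords.
RD1 = [x for x in Test if x not in unWantedWords]
print(RD1)  # ['love']
```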

unWantedWords is a list, and you are comparing each word against that whole list, which is why it does not work.

If you don't mind:

  1. removing duplicates
  2. losing the original order

you can simply use a set difference (here is the core idea):

>>> Test = ['a', 'an', 'the', 'love']
>>> unWantedWords = ['a', 'an', 'the']
>>> print(set(Test) - set(unWantedWords))
{'love'}

>>> print(list(set(Test) - set(unWantedWords)))
['love']

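This has the advantage of better complexity: a membership test on a set takes constant time on average, whereas each in check against a list scans the list element by element.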

Of course you can wrap this code in order to keep duplicates and order...
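One possible way to do that wrapping is sketched below. The function name remove_words and the sample input are assumptions made for illustration, not part of the original answer. The unwanted words are converted to a set once, and a list comprehension then filters the input, so both order and duplicates survive.

```python
def remove_words(words, unwanted):
    """Return the items of words that are not in unwanted.

    Order and duplicates of the input are preserved.
    (Hypothetical helper, named here only for illustration.)
    """
    unwanted_set = set(unwanted)  # built once, not once per element
    return [w for w in words if w not in unwanted_set]


# Hypothetical sample input with a repeated word.
Test = ['the', 'love', 'a', 'love', 'an', 'song']
unWantedWords = ['a', 'an', 'the']
print(remove_words(Test, unWantedWords))  # ['love', 'love', 'song']
```

Note that the words must be hashable for the set conversion to work, which is the case for strings.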

This line is wrong:

RD1 = [x for x in Test if x != unWantedWords]

Your condition if x != unWantedWords compares x with the whole list unWantedWords, instead of checking whether x is one of the items in unWantedWords.

The condition is always true because x is a string, and a string never compares equal to a list. Therefore all your words are added to the result.
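The difference between the two tests can be seen on a single word. This is a small sketch that reuses the question's list and picks 'a' as the example value.

```python
unWantedWords = ['a', 'an', 'the']
x = 'a'

# A string compared with a list is never equal, so != is always True.
print(x != unWantedWords)      # True

# Membership looks at the items inside the list.
print(x not in unWantedWords)  # False
```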

The correct idiom would be if x not in unWantedWords.

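You can also write RD1 = [x for x in Test if x not in set(unWantedWords)], but note that this rebuilds the set for every element of Test. Converting once beforehand, e.g. unwanted = set(unWantedWords), and then testing x not in unwanted avoids the repeated work.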


 