
Exclude words from a list that contain one or more characters from another list python

I have a list of words as input. Some of these words contain characters that are not ASCII letters, and I need to filter out the entire word if it contains any character that is not in my list of ASCII letters.

So if the input is:

words = ['Hello', 'my','dear', 'de7ar', 'Fri?ends', 'Friends']

I need the output:

['Hello', 'my', 'dear', 'Friends']


words = ['Hello', 'my','dear', 'de7ar', 'Fri?ends', 'Friends']
al = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
ascii_letters = [char for char in al] 

filtered_words=[]

I tried it with this:

for el in words:  
    try:
        words in ascii_letters 
    except FALSE: 
        filtered_words.append(el)

and this

filtered_words = [ele for ele in words if all(ch not in ele for ch in ascii_letters)]

but neither of them produces what I need. I do understand why, but since I have only been learning Python for a week, I can't work out how to adjust them to do what I want. Maybe someone knows how to handle this (without using any libraries)? Thanks

You could check whether your alphabet is a superset of each word's characters:

>>> [*filter(set(al).issuperset, words)]
['Hello', 'my', 'dear', 'Friends']

Btw, it's better not to hardcode that alphabet (I've seen quite a few people do that and forget letters); import it instead:

from string import ascii_letters as al
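Putting the two pieces above together, a minimal self-contained sketch might look like the following. The names allowed and filtered_words are only illustrative, and note that string is part of the standard library, so nothing has to be installed:

```python
from string import ascii_letters  # 'abc...xyzABC...XYZ', the 52 ASCII letters

words = ['Hello', 'my', 'dear', 'de7ar', 'Fri?ends', 'Friends']

# Build the set once so that each character lookup is cheap.
allowed = set(ascii_letters)

# issuperset() accepts any iterable, so each word is treated as a
# sequence of characters; a word is kept only if all of them are allowed.
filtered_words = [word for word in words if allowed.issuperset(word)]

print(filtered_words)  # ['Hello', 'my', 'dear', 'Friends']
```

One thing to keep in mind: an empty string would also pass this check, because the empty set of characters is a subset of anything.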

You need to iterate through the words in the list and check whether all of each word's letters are ASCII letters, or you can use the all() function:

words = ['Hello', 'my','dear', 'de7ar', 'Fri?ends', 'Friends']
al = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
ascii_letters = [char for char in al]

out = []
for word in words:
    not_in_ascii = False
    for letter in word:
        if letter not in ascii_letters:
            not_in_ascii = True
    if not_in_ascii:
        continue
    out.append(word)
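The loop above keeps scanning a word even after it has found a bad character. As a possible variant, the check can be moved into a small helper that returns as soon as the first disallowed character shows up. The helper name is_ascii_word is made up for this sketch, and no libraries are used:

```python
AL = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'


def is_ascii_word(word, allowed=AL):
    """Return True if every character of word occurs in allowed.

    Hypothetical helper for illustration; not part of the original answer.
    """
    for letter in word:
        if letter not in allowed:
            return False  # stop at the first disallowed character
    return True


words = ['Hello', 'my', 'dear', 'de7ar', 'Fri?ends', 'Friends']

out = []
for word in words:
    if is_ascii_word(word):
        out.append(word)

print(out)  # ['Hello', 'my', 'dear', 'Friends']
```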

It is also possible with list comprehension and all() as you tried:

out = [word for word in words if all(letter in ascii_letters for letter in word)]

Alternatively, using str.isalpha():

[i for i in words if i.isalpha()]

Result:

['Hello', 'my', 'dear', 'Friends']
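One caveat with isalpha(): it returns True for any Unicode letter, not only ASCII ones, so a word such as 'café' would be kept. If that matters, a rough sketch of a stricter check is to combine it with str.isascii(), which assumes Python 3.7 or newer. The extra word in the list below is added purely for illustration:

```python
# 'caf\u00e9' is 'café' written with an escape so the source stays ASCII.
words = ['Hello', 'my', 'dear', 'de7ar', 'Fri?ends', 'Friends', 'caf\u00e9']

# isalpha() alone accepts non-ASCII letters, so 'café' slips through.
alpha_only = [w for w in words if w.isalpha()]
# alpha_only == ['Hello', 'my', 'dear', 'Friends', 'café']

# isascii() (Python 3.7+) rejects any word with a non-ASCII character.
ascii_alpha_only = [w for w in words if w.isascii() and w.isalpha()]

print(ascii_alpha_only)  # ['Hello', 'my', 'dear', 'Friends']
```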
