I have a list of words resembling the following:
mylist=["hi", "h_ello", "how're", "you", "@list"]
I would like to pull out all of the words that contain non-alphanumeric characters, to give a result such as:
"h_ello", "how're", "@list"
Please note that my real list is much longer, and it contains other non-alphanumeric characters such as ~, ?, >, =, + etc.
Does anyone know how to do this, please? Thank you.
Use str.isalpha(), which returns True only when every character in the string is a letter.
Example:
mylist=["hi", "h_ello", "how're", "you", "@list"]
print([i for i in mylist if not i.isalpha()])
Output:
['h_ello', "how're", '@list']
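One caveat worth checking: isalpha() also rejects digits, so a purely alphanumeric word would be flagged as well. A small sketch of the difference from isalnum(), using made-up sample words that are not from the question:

```python
# Hypothetical sample words, chosen only to show the digit case.
words = ["hi", "abc123", "h_ello", "2024"]

# isalpha() is False for any string containing a digit,
# so "abc123" and "2024" are reported alongside "h_ello".
flagged_by_isalpha = [w for w in words if not w.isalpha()]

# isalnum() accepts both letters and digits, so only "h_ello" is reported.
flagged_by_isalnum = [w for w in words if not w.isalnum()]

print(flagged_by_isalpha)
print(flagged_by_isalnum)
```

If the real list can contain digits, isalnum() is probably the closer match for "non-alphanumeric".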
You can use a list comprehension in combination with the isalnum() method.
mylist=["hi", "h_ello", "how're", "you", "@list"]
print([i for i in mylist if not i.isalnum()])
Output:
['h_ello', "how're", '@list']
From the Python documentation:
str.isalnum()
Return true if all characters in the string are alphanumeric and there is at least one character, false otherwise. A character c is alphanumeric if one of the following returns True: c.isalpha(), c.isdecimal(), c.isdigit(), or c.isnumeric().
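The "at least one character" clause matters here: an empty string is not alphanumeric, so a `not s.isalnum()` filter would keep it. A minimal sketch of that edge case, with a per-character alternative (the helper name contains_non_alnum and the sample strings are my own, not from the question):

```python
# Hypothetical sample strings, including an empty one.
samples = ["hello", "abc123", "h_ello", "how're", "two words", ""]

# Map each sample to the result of str.isalnum().
results = {s: s.isalnum() for s in samples}
for s, ok in results.items():
    print(repr(s), ok)

def contains_non_alnum(word):
    """Return True only if word has at least one non-alphanumeric character."""
    return any(not ch.isalnum() for ch in word)

# Unlike `not word.isalnum()`, this does not flag the empty string.
print([s for s in samples if contains_non_alnum(s)])
```

Whether empty strings should count as matches depends on your data, so pick whichever behaviour you actually want.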
You can also use filter with re:
import re
mylist=["hi", "h_ello", "how're", "you", "@list"]
new_list = list(filter(lambda x: re.search(r'[\W_]', x), mylist))
print(new_list)
Output:
['h_ello', "how're", '@list']
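For a long list it may be worth compiling the pattern once. A rough sketch, assuming you want the same `[\W_]` test as above; the names UNICODE_PATTERN, ASCII_PATTERN and filter_non_alnum are made up for illustration. Note that in Python 3, `\W` is Unicode-aware for str patterns, so accented letters count as word characters unless you restrict the class to ASCII yourself:

```python
import re

# \W matches anything that is not a letter, digit, or underscore;
# adding "_" to the class makes the underscore count as a match too.
UNICODE_PATTERN = re.compile(r"[\W_]")

# Stricter variant: anything outside ASCII letters and digits matches.
ASCII_PATTERN = re.compile(r"[^A-Za-z0-9]")

def filter_non_alnum(words, pattern=UNICODE_PATTERN):
    """Keep the words in which the pattern finds at least one match."""
    return [w for w in words if pattern.search(w)]

mylist = ["hi", "h_ello", "how're", "you", "@list"]
result = filter_non_alnum(mylist)
print(result)

# "caf\u00e9" (cafe with an accented e) passes the Unicode-aware test
# but is flagged by the ASCII-only one.
print(filter_non_alnum(["caf\u00e9", "a~b"], ASCII_PATTERN))
```

search() stops at the first match, which is all that is needed to decide whether a word qualifies.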
You are better off with isalnum or a regex. Just for fun, here is a different approach. It is slow and not meant for production code; I only want to show another way of doing it:
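import sys
import unicodedata

mylist = ["hi", "h_ello", "how're", "you", "@list"]

# Code points of all Unicode punctuation characters (categories starting with 'P'), built once.
# Note: symbols such as ~, >, = and + are in category 'S', not 'P', so this does not catch them.
pun = {i for i in range(sys.maxunicode + 1) if unicodedata.category(chr(i)).startswith('P')}

def translate(text_):
    # Return the word if it contains punctuation; otherwise fall through and return None.
    if any(ord(ch) in pun for ch in text_):
        return text_

print(list(filter(lambda x: x, [translate(i) for i in mylist])))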
Output:
['h_ello', "how're", '@list']
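If you like the unicodedata idea, you can skip scanning the whole code point range and just look up the category of each character in the word. This is only a sketch of a variation, not the approach above: the helper name has_punct_or_symbol is my own, and it also accepts symbol categories ('S') so that characters like ~, >, = and + from the question are caught.

```python
import unicodedata

def has_punct_or_symbol(word):
    """Return True if any character is Unicode punctuation (P*) or a symbol (S*)."""
    # unicodedata.category() returns a two-letter code such as 'Po' or 'Sm';
    # the first letter is the major class.
    return any(unicodedata.category(ch)[0] in "PS" for ch in word)

mylist = ["hi", "h_ello", "how're", "you", "@list"]
result = [w for w in mylist if has_punct_or_symbol(w)]
print(result)

# Symbols from the question are detected as well.
print([w for w in ["a+b", "x=y", "ok"] if has_punct_or_symbol(w)])
```

Keep in mind this is not the same test as isalnum(): whitespace, for example, is neither punctuation nor a symbol, so a word containing a space would not be flagged here.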