How can I remove all non-letter (all languages) and non-numeric characters from a string?

I've been searching for quite some time now, yet I cannot find any explanation of this.

Say I have a string such as u'àaeëß35+{}"´'. I want all non-alphanumeric characters removed, but I want à, ë, ß, etc. kept.

I'm fairly new to Python and could not figure out a regex to perform this task. The only other solution I can think of is keeping a list of the characters I want to remove and iterating through the string, replacing them.

What is the correct Pythonic solution here?

Thank you.

In [63]: s = u'àaeëß35+{}"´'

In [64]: print ''.join(c for c in s if c.isalnum())
àaeëß35
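
The session above is Python 2 (print statement, u'' prefix). As a minimal sketch, assuming Python 3 where every str is Unicode, the same filter looks like this; the regex variant is included only because the question asked about one, and the variable names are illustrative:

```python
import re

s = 'àaeëß35+{}"´'

# str.isalnum() is Unicode-aware, so accented letters and ß are kept.
filtered = ''.join(c for c in s if c.isalnum())
print(filtered)

# Regex alternative: \W matches any non-word character. The underscore
# counts as a word character, so it is added to the class explicitly.
regex_filtered = re.sub(r'[\W_]+', '', s)
print(regex_filtered)
```

Both lines should print the same result for this sample string. The generator version is probably the easier one to read; the regex may be handier if the pattern needs further tweaking later.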

What about:

def StripNonAlpha(s):
    # Keeps letters from any script; digits are removed as well.
    return "".join(c for c in s if c.isalpha())
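
One thing to watch with an isalpha()-based filter: it also strips the digits, while the question wants them kept. A small side-by-side sketch, assuming Python 3 (the helper names keep_letters and keep_alnum are made up for illustration):

```python
def keep_letters(s):
    # isalpha() is True only for letters, so '3' and '5' are dropped.
    return ''.join(c for c in s if c.isalpha())


def keep_alnum(s):
    # isalnum() is True for letters and digits in any script.
    return ''.join(c for c in s if c.isalnum())


sample = 'àaeëß35+{}"´'
print(keep_letters(sample))  # àaeëß
print(keep_alnum(sample))    # àaeëß35
```

If the digits should survive, the isalnum() variant is the one that matches the question.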
