
How do you filter a string to only contain letters?

How do I write a function that filters out all the non-letters from a string? For example, letters("jajk24me") should return "jajkme". It needs to use a for loop. Would the string.isalpha() function help me with this?

My attempt:

def letters(input):
    valids = []
    for character in input:
        if character in letters:
            valids.append( character)
    return (valids)

If it needs to be in that for loop, and a regular expression won't do, then this small modification of your loop will work:

def letters(input):
    valids = []
    for character in input:
        if character.isalpha():
            valids.append(character)
    return ''.join(valids)

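(The ''.join(valids) at the end takes all of the characters that you have collected in a list and joins them together into a string. Your original function returned that list of characters instead. It also tested "character in letters", where letters is the function itself, so that check raises a TypeError; character.isalpha() is the test you want.)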

You can also filter out characters from a string:

def letters(input):
    return ''.join(filter(str.isalpha, input))

or with a list comprehension:

def letters(input):
    return ''.join([c for c in input if c.isalpha()])

or you could use a regular expression, as others have suggested.

import re
valids = re.sub(r"[^A-Za-z]+", '', my_string)

EDIT: If it needs to be a for loop, something like this should work:

output = ''
for character in input:
    if character.isalpha():
        output += character
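One difference between the isalpha() versions and the [^A-Za-z] pattern above may matter: str.isalpha() accepts any Unicode letter, while the character class only keeps ASCII letters. A small sketch of the difference (the function names letters_isalpha and letters_ascii are made up for this illustration):

```python
import re

def letters_isalpha(text):
    # Keeps every character that Unicode classifies as a letter.
    return ''.join(c for c in text if c.isalpha())

def letters_ascii(text):
    # Keeps only the 52 ASCII letters A-Z and a-z.
    return re.sub(r"[^A-Za-z]+", "", text)

sample = "caf\u00e924"  # "café24", with a precomposed e-acute
print(letters_isalpha(sample))  # the accented letter survives
print(letters_ascii(sample))    # the accented letter is dropped
```

Which behaviour you want depends on whether "letters" means ASCII only or letters in any script.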

See re.sub; for performance, consider using re.compile so the pattern is compiled only once.
Below is a short version that matches every run of characters outside the range A to Z and replaces it with the empty string. The re.I flag makes the match case-insensitive, so lowercase letters (a-z) are kept as well.

import re

def charFilter(myString):
    return re.sub('[^A-Z]+', '', myString, flags=re.I)
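A minimal sketch of the re.compile suggestion, assuming the filter is called many times. The names NON_LETTERS and char_filter are illustrative, not from the original answer:

```python
import re

# Compiled once when the module is loaded, then reused on every call.
NON_LETTERS = re.compile(r"[^A-Z]+", re.IGNORECASE)

def char_filter(my_string):
    # Replace each run of non-letters with the empty string.
    return NON_LETTERS.sub("", my_string)

print(char_filter("jajk24me"))
```

The re module already caches recently used patterns, so the gain is likely modest; the main benefit is that the pattern gets a name and is defined in one place.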

If you really need that loop, there are many answers explaining that specifically. However, you might want to give a reason why you need a loop.

If you want to operate on the removed sequences (the numbers, for example) and that's the reason for the loop, consider replacing the replacement string parameter with a function like:

import re

def numberPrinter(match):
    # re.sub passes a match object; group() returns the matched text
    print(match.group())
    return ''

def charFilter(myString):
    return re.sub('[^A-Z]+', numberPrinter, myString, flags=re.I)

The method string.isalpha() checks whether a string is non-empty and consists of alphabetic characters only. You can use it to check whether any modification is needed. As to the other part of the question, pst is right. You can read about regular expressions in the Python docs: http://docs.python.org/library/re.html They might seem daunting but are really useful once you get the hang of them.
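As a sketch of that "check first" idea (one possible reading of the suggestion, not necessarily what the answerer had in mind), isalpha() on the whole string can short-circuit the filtering:

```python
def letters(text):
    # If every character is already a letter, nothing needs to change.
    if text.isalpha():
        return text
    # Otherwise keep only the alphabetic characters, in order.
    return ''.join(c for c in text if c.isalpha())

print(letters("jajk24me"))
print(letters("hello"))
```

Note that ''.isalpha() is False, so an empty string simply falls through to the join and comes back empty.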

Of course you can use isalpha . Also, valids can be a string.

Here you go:

def letters(input):
    valids = ""
    for character in input:
        if character.isalpha():
            valids += character
    return valids

This one doesn't use a for loop, but that's already been thoroughly covered.

Might be a little late, and I'm not sure about performance, but I just thought of this solution which seems pretty nifty:

set(x).intersection(y)

You could use it like:

from string import ascii_letters

def letters(string):
    return ''.join(set(string).intersection(ascii_letters))

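NOTE: This will not preserve the original order of the characters, and because it goes through a set, repeated letters are collapsed into a single occurrence. In my use case that is fine, but be warned.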
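If order and repeated letters do matter, a set can still be used for the membership test while the string itself is walked in order. A hedged sketch (ALLOWED, letters_ordered and letters_unordered are names invented for this comparison):

```python
from string import ascii_letters

# Set of allowed characters, used only for fast membership checks.
ALLOWED = frozenset(ascii_letters)

def letters_ordered(text):
    # Walks the input left to right, so order and duplicates are kept.
    return ''.join(c for c in text if c in ALLOWED)

def letters_unordered(text):
    # The set-intersection approach: each letter once, arbitrary order.
    return ''.join(set(text).intersection(ascii_letters))

print(letters_ordered("jajk24me"))
print(sorted(letters_unordered("jajk24me")))
```

Here letters_ordered returns "jajkme", while letters_unordered returns the five distinct letters in whatever order the set yields them.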

