Python regex: how to match characters that are not in an alphabet

Question

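Suppose alphabet is a list of characters. I want to remove from a string every character that does not belong to alphabet. How can I match all of those characters?

Edit: alphabet can contain any characters, not necessarily letters.

Edit 2: Just curious, can this be done with a regular expression?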

Answer 1

Use the string library. Here I use string.ascii_letters and add the digits as well. In this case the valid characters are "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789", plus a few others if needed: "-_.() "

import string
def valid_name(input):
    valid_chars = "-_.() "+string.ascii_letters + string.digits
    return ''.join(c for c in input if c in valid_chars)

Answer 2

You don't actually need a regular expression. All you need is:

# "alphabet" can be any string or list of any characters
alphabet = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 
            'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 
            'u', 'v', 'w', 'x', 'y', 'z']

# "oldstr" is your old string
newstr = ''.join([c for c in oldstr if c not in alphabet])

In the end, newstr will be a new string containing only the characters of oldstr that are not in alphabet. Here is a demonstration:

>>> alphabet = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 
...             'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 
...             'u', 'v', 'w', 'x', 'y', 'z']
>>> oldstr = 'abc123'
>>> newstr = ''.join([c for c in oldstr if c not in alphabet])
>>> newstr
'123'
>>>
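Note that this filter keeps the characters that are not in alphabet, which is the reverse of what the question asks for. A minimal sketch of both directions follows. The names kept and dropped are only illustrative, and turning alphabet into a set is an optional assumption that speeds up membership tests for large alphabets.

```python
import string

# Any iterable of characters works; a set gives O(1) membership tests.
alphabet = set(string.ascii_lowercase)
oldstr = "abc123"

# Characters of oldstr that ARE in alphabet (what the question wants to keep).
kept = "".join(c for c in oldstr if c in alphabet)

# Characters of oldstr that are NOT in alphabet (what the filter above returns).
dropped = "".join(c for c in oldstr if c not in alphabet)

print(kept)     # abc
print(dropped)  # 123
```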

Answer 3

If you want to use a regular expression:

Use this regex: [^a-zA-Z]

That will match every non-letter. Warning: it will also match whitespace. To avoid that, use [^a-zA-Z\s] instead.
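As a sketch of how these two patterns might be applied with re.sub (the sample string and variable names are only illustrative):

```python
import re

text = "hello123 this is a test!!!"

# [^a-zA-Z] matches every character that is not an ASCII letter, spaces included.
letters_only = re.sub(r"[^a-zA-Z]", "", text)

# [^a-zA-Z\s] also spares whitespace, so the words stay separated.
letters_and_spaces = re.sub(r"[^a-zA-Z\s]", "", text)

print(letters_only)        # hellothisisatest
print(letters_and_spaces)  # hello this is a test
```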

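A simpler way:

You don't actually need a regular expression for this at all. Just build a string of the accepted characters, then filter out every character of your string that is not among them. For example: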

import string  # provides ready-made strings of all ASCII letters

your_word = "hello123 this is a test!!!"
# The trailing " " is needed so that spaces are not removed.
accepted_characters = string.ascii_lowercase + string.ascii_uppercase + " "
new_word = ""
for letter in your_word:
    if letter in accepted_characters:
        new_word += letter

That will give you "hello this is a test"

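Ultra-short way:

This method is not the most readable, but it does the job in a single line. It is essentially the same as the method above, but it uses a list comprehension and the join method to turn the resulting list into a string.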

''.join([letter for letter in your_word if letter in (string.ascii_lowercase + string.ascii_uppercase + " ")])

Answer 4

Instead of a regular expression, here is a solution using str.translate() (as written, it runs on Python 2 only):

import string

def delete_chars_not_in_alphabet(s, alphabet=string.letters):
    all_chars = string.maketrans('', '')
    all_except_alphabet = all_chars.translate(None, alphabet)
    return s.translate(None, all_except_alphabet)

Example:

>>> delete_chars_not_in_alphabet('<Hello World!>')
'HelloWorld'
>>> delete_chars_not_in_alphabet('foo bar baz', 'abo ')
'oo ba ba'

Note that if you call this function repeatedly with the same alphabet, you should construct all_except_alphabet outside the function (and only once) to make it more efficient.
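The function above relies on features that exist only in Python 2 (string.letters, string.maketrans, and the two-argument form of str.translate). A possible Python 3 counterpart is sketched below; it is not the answerer's code. It assumes the documented behaviour that str.translate accepts any mapping keyed by code point and deletes characters that map to None. The class name KeepOnly is hypothetical.

```python
import string


class KeepOnly(dict):
    """Translation table that keeps alphabet characters and deletes the rest."""

    def __init__(self, alphabet):
        # Map each allowed code point to itself.
        super().__init__((ord(c), c) for c in alphabet)

    def __missing__(self, code_point):
        # str.translate deletes characters whose lookup yields None.
        return None


# Build the table once and reuse it across calls.
LETTERS_ONLY = KeepOnly(string.ascii_letters)


def delete_chars_not_in_alphabet(s, table=LETTERS_ONLY):
    return s.translate(table)


print(delete_chars_not_in_alphabet("<Hello World!>"))                 # HelloWorld
print(delete_chars_not_in_alphabet("foo bar baz", KeepOnly("abo ")))  # oo ba ba
```

Because the table is built outside the function, repeated calls with the same alphabet pay the construction cost only once.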

Answer 5

Check out re.sub, and use a negated character class such as '[^a-d]' or '[^abcd]'. http://docs.python.org/2.7/library/re.html
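A sketch of that suggestion for an arbitrary alphabet: re.escape protects characters such as ], ^, - and \ that would otherwise change the meaning of the character class. The helper name strip_outside is hypothetical.

```python
import re


def strip_outside(s, alphabet):
    """Remove from s every character that is not in alphabet."""
    chars = "".join(alphabet)
    if not chars:
        # "[^]" is not a valid pattern; an empty alphabet keeps nothing.
        return ""
    pattern = "[^" + re.escape(chars) + "]"
    return re.sub(pattern, "", s)


print(strip_outside("abc123", ["a", "b", "c"]))  # abc
print(strip_outside("a-b]c^d", "-]^"))           # -]^
```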


Answer 1
1 2013-10-31 23:40:30

Answer 2
1 2013-10-31 23:41:45

Answer 3
0 2013-10-31 23:40:06

Answer 4
0 2013-10-31 23:42:14

Answer 5
-1 2013-10-31 23:43:46
