
Removing everything except letters and spaces from string in Python 3.3

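I have this example string: happy t00 go 129.129, and I want to keep only the spaces and letters. So far, the only reasonably efficient thing I have come up with is: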

import re
print(re.sub(r"\d", "", 'happy t00 go 129.129'.replace('.', '')))

But this only works for my example string. How can I remove every character other than letters and spaces?

whitelist = set('abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ')
myStr = "happy t00 go 129.129$%^&*("
answer = ''.join(filter(whitelist.__contains__, myStr))

Output:

>>> answer
'happy t go '

Use a negated character set:

re.sub(r'[^a-zA-Z ]+', '', 'happy t00 go 129.129')
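The one-liner above omits the import and its result. A minimal runnable sketch of the same idea follows; the function name keep_letters_and_spaces and the module-level pattern name are hypothetical, chosen only for illustration.

```python
import re

# Negated character class: matches runs of anything that is NOT
# an ASCII letter or a space.
_NOT_LETTER_OR_SPACE = re.compile(r'[^a-zA-Z ]+')

def keep_letters_and_spaces(text):
    """Return text with everything except ASCII letters and spaces removed."""
    return _NOT_LETTER_OR_SPACE.sub('', text)

print(repr(keep_letters_and_spaces('happy t00 go 129.129')))  # 'happy t go '
```

Note that only the literal space character is kept; tabs and newlines are removed along with digits and punctuation.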

A slight variation on inspectorG4dget's approach, using an import from string and a generator expression:

from string import ascii_letters

allowed = set(ascii_letters + ' ')
myStr = 'happy t00 go 129.129'
answer = ''.join(l for l in myStr if l in allowed)
answer
# >>> 'happy t go '
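The whitelists above cover ASCII letters only. If accented or non-Latin letters should also be kept (an assumption, since the question does not say), one possible sketch relies on str.isalpha instead of a fixed set; the function name here is hypothetical.

```python
def keep_unicode_letters_and_spaces(text):
    # str.isalpha() is True for any Unicode letter, not only a-z and A-Z.
    return ''.join(ch for ch in text if ch.isalpha() or ch == ' ')

print(repr(keep_unicode_letters_and_spaces('café t00 go 129.129')))  # 'café t go '
```

For pure ASCII input this gives the same result as the set-based version.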

Performance comparison:

(I made myStr longer and precompiled the regex to make things more interesting.)

import re
from string import ascii_letters
myStr = 'happy t00 go 129.129'*20
allowed = set(ascii_letters + ' ')

# Generator
%timeit answer = ''.join(l for l in myStr if l in allowed)

# filter/__contains__
%timeit answer = ''.join(filter(allowed.__contains__, myStr))

# Regex
pat = re.compile(r'[^a-zA-Z ]+')
%timeit answer = re.sub(pat, '', myStr)

53 µs ± 6.43 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
43.3 µs ± 7.48 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
26 µs ± 509 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
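%timeit is an IPython magic, so the comparison above will not run in a plain Python interpreter. A rough equivalent using the standard timeit module might look like the sketch below; the function names are hypothetical, the loop count is reduced to keep the run short, and the numbers will vary by machine and Python version.

```python
import re
import timeit
from string import ascii_letters

my_str = 'happy t00 go 129.129' * 20
allowed = set(ascii_letters + ' ')
pat = re.compile(r'[^a-zA-Z ]+')

def with_generator():
    return ''.join(ch for ch in my_str if ch in allowed)

def with_filter():
    return ''.join(filter(allowed.__contains__, my_str))

def with_regex():
    return pat.sub('', my_str)

# All three approaches should agree before their speed is compared.
assert with_generator() == with_filter() == with_regex()

loops = 1000
for func in (with_generator, with_filter, with_regex):
    # Best of 5 repeats, reported per call in microseconds.
    best = min(timeit.repeat(func, number=loops, repeat=5))
    print(f'{func.__name__}: {best / loops * 1e6:.1f} µs per call')
```

Taking the minimum of several repeats reduces noise from other processes; it is not the same statistic as the mean and standard deviation that %timeit reports.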
