Removing everything except letters and spaces from a string in Python 3.3
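I have this example string: happy t00 go 129.129
and I want to keep only the spaces and letters. The best I have come up with so far, which is reasonably efficient, is: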
import re
print(re.sub(r"\d", "", 'happy t00 go 129.129'.replace('.', '')))
But this only works for my example string. How can I remove every character other than letters and spaces?
whitelist = set('abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ')
myStr = "happy t00 go 129.129$%^&*("
answer = ''.join(filter(whitelist.__contains__, myStr))
Output:
>>> answer
'happy t go '
Use a negated character set (a set complement):
re.sub(r'[^a-zA-Z ]+', '', 'happy t00 go 129.129')
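One thing worth noting about the pattern above: [^a-zA-Z ] treats only ASCII letters as letters, so accented characters are removed as well. Here is a minimal sketch of both behaviours. The function names are made up for illustration, and the str.isalpha() variant is an alternative I am suggesting, not something from the answer itself.

```python
import re

def keep_ascii_letters_and_spaces(text):
    # Delete every run of characters that is not an ASCII letter or a space.
    return re.sub(r'[^a-zA-Z ]+', '', text)

def keep_unicode_letters_and_spaces(text):
    # str.isalpha() is True for any Unicode letter, so accented letters survive.
    return ''.join(ch for ch in text if ch.isalpha() or ch == ' ')

print(repr(keep_ascii_letters_and_spaces('happy t00 go 129.129')))
print(repr(keep_ascii_letters_and_spaces('caf\u00e9 1')))
print(repr(keep_unicode_letters_and_spaces('caf\u00e9 1')))
```

If the input is known to be plain ASCII, the two functions should give the same result and the regex is the shorter option.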
A slight variation on inspectorG4dget's approach: take the letters from the string
module and use a generator expression:
from string import ascii_letters
allowed = set(ascii_letters + ' ')
myStr = 'happy t00 go 129.129'
answer = ''.join(l for l in myStr if l in allowed)
answer
# >>> 'happy t go '
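The whitelist idea above can be wrapped in a small reusable function. This is only a sketch: keep_allowed and collapse_spaces are hypothetical names, and collapsing the leftover spaces is an optional extra step that the question did not ask for.

```python
from string import ascii_letters

# Characters that are allowed to survive: ASCII letters plus the space.
ALLOWED = frozenset(ascii_letters + ' ')

def keep_allowed(text, allowed=ALLOWED):
    # Keep only the characters found in the whitelist, preserving order.
    return ''.join(ch for ch in text if ch in allowed)

def collapse_spaces(text):
    # Optional: squeeze runs of whitespace into one space and trim the ends.
    return ' '.join(text.split())

cleaned = keep_allowed('happy t00 go 129.129')
print(repr(cleaned))
print(repr(collapse_spaces(cleaned)))
```

Passing a different set as allowed (for example one that also contains digits) changes what is kept without touching the function body.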
(I made myStr longer and precompiled the regex to make things a bit more interesting)
import re
from string import ascii_letters
myStr = 'happy t00 go 129.129'*20
allowed = set(ascii_letters + ' ')
# Generator
%timeit answer = ''.join(l for l in myStr if l in allowed)
# filter/__contains__
%timeit answer = ''.join(filter(allowed.__contains__, myStr))
# Regex
pat = re.compile(r'[^a-zA-Z ]+')
%timeit answer = re.sub(pat, '', myStr)
53 µs ± 6.43 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
43.3 µs ± 7.48 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
26 µs ± 509 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
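The %timeit lines only work inside IPython. For anyone who wants to repeat the comparison from a plain script, here is a rough equivalent using the standard timeit module. The function names and the loop count of 1000 are my own choices, and the numbers will differ from machine to machine.

```python
import re
import timeit
from string import ascii_letters

my_str = 'happy t00 go 129.129' * 20
allowed = set(ascii_letters + ' ')
pat = re.compile(r'[^a-zA-Z ]+')

def with_generator():
    # Generator expression with a set membership test.
    return ''.join(ch for ch in my_str if ch in allowed)

def with_filter():
    # filter() driven directly by the set's __contains__ method.
    return ''.join(filter(allowed.__contains__, my_str))

def with_regex():
    # Precompiled negated character class.
    return pat.sub('', my_str)

# All three approaches should produce the same cleaned string.
assert with_generator() == with_filter() == with_regex()

loops = 1000
for func in (with_generator, with_filter, with_regex):
    seconds = timeit.timeit(func, number=loops)
    print('{}: {:.1f} us per loop'.format(func.__name__, seconds / loops * 1e6))
```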