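Python cleaning words in a sentence
I am trying to write a function that accepts a string (a sentence), cleans it, and returns only the letters, digits, and hyphens. But the code seems to go wrong. Please tell me what I am doing wrong here.
Example: Blake D'souza is an !d!0t
Should return: Blake D'souza is an d0t
Python: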
def remove_unw2anted(str):
    str = ''.join([c for c in str if c in 'ABCDEFGHIJKLNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890\''])
    return str

def clean_sentence(s):
    lst = [word for word in s.split()]
    #print lst
    for items in lst:
        cleaned = remove_unw2anted(items)
    return cleaned

s = 'Blake D\'souza is an !d!0t'
print clean_sentence(s)
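You are only returning the last cleaned word! Each pass through the loop overwrites cleaned, so only the final word survives.
It should be: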
def clean_sentence(s):
    lst = [word for word in s.split()]
    lst_cleaned = []
    for items in lst:
        lst_cleaned.append(remove_unw2anted(items))
    return ' '.join(lst_cleaned)
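For readers on Python 3, the same per-word fix can be sketched as follows. This is only an illustration under a few assumptions: `remove_unwanted` and `ALLOWED` are hypothetical names, the hyphen is added to the kept characters because the question asks for it, and the hand-typed alphabet is replaced with `string.ascii_letters` because the literal in the question omits the uppercase `M`.

```python
import string

# Assumption: Python 3. ALLOWED is a hypothetical name for the set of
# characters to keep: ASCII letters, digits, apostrophe, and hyphen.
ALLOWED = set(string.ascii_letters + string.digits + "'-")

def remove_unwanted(word):
    # Keep only the characters that appear in ALLOWED.
    return ''.join(c for c in word if c in ALLOWED)

def clean_sentence(s):
    # Clean every word and join them all, instead of returning only the last one.
    return ' '.join(remove_unwanted(word) for word in s.split())

print(clean_sentence("Blake D'souza is an !d!0t"))
```

Building the allowed set from the `string` constants avoids the kind of silent gap a hand-typed alphabet can contain, so a word such as "Mary" keeps its first letter.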
A shorter way could be something like this:
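def is_ok(c):
    # Keep letters, digits, spaces, and apostrophes.
    return c.isalnum() or c in " '"

def clean_sentence(s):
    # Python 2: filter() applied to a str returns a str.
    return filter(is_ok, s)

s = "Blake D'souza is an !d!0t"
print clean_sentence(s)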
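In Python 3, `filter` returns an iterator rather than a string, so the shorter approach needs a `''.join` around it. A minimal sketch, assuming Python 3 and keeping the same `is_ok` test as above:

```python
def is_ok(c):
    # Same test as above: alphanumerics, spaces, and apostrophes.
    return c.isalnum() or c in " '"

def clean_sentence(s):
    # Assumption: Python 3, where filter() yields characters lazily,
    # so they have to be joined back into a string.
    return ''.join(filter(is_ok, s))

s = "Blake D'souza is an !d!0t"
print(clean_sentence(s))
```

Note that `str.isalnum()` in Python 3 is Unicode-aware, so accented letters and non-ASCII digits also pass the test. If only ASCII should survive, an explicit set of allowed characters is the safer choice.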
Is there any merit in a variant that uses string.translate? It is easy to extend and is part of string.
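import string
# Python 2: maketrans('', '') yields the identity table of all 256 characters.
allchars = string.maketrans('','')
tokeep = string.letters + string.digits + '-'
# Everything that is not in tokeep is marked for deletion.
toremove = allchars.translate(None, tokeep)
s = "Blake D'souza is an !d!0t"
print s.translate(None, toremove)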
Output:
BlakeDsouzaisand0t
The OP says to keep only letters, digits, and hyphens; perhaps they also mean to keep whitespace?
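The translate variant is indeed easy to extend. The sketch below assumes Python 3, where `string.maketrans` and `string.letters` no longer exist and `str.maketrans` and `string.ascii_letters` take their place. `make_cleaner` and `extra_keep` are hypothetical names introduced only for this illustration.

```python
import string

def make_cleaner(extra_keep=""):
    """Return a function that deletes unwanted characters from a string.

    Assumption: Python 3. make_cleaner and extra_keep are hypothetical
    names for this sketch, not part of the original answer.
    """
    to_keep = set(string.ascii_letters + string.digits + "-" + extra_keep)
    # Mirror the 256-character table of the original: only code points
    # 0..255 are candidates for removal; anything above passes through.
    to_remove = "".join(chr(i) for i in range(256) if chr(i) not in to_keep)
    # The three-argument form of str.maketrans maps every character
    # in its third argument to None, which deletes it.
    table = str.maketrans("", "", to_remove)
    return lambda s: s.translate(table)

s = "Blake D'souza is an !d!0t"
print(make_cleaner()(s))      # same keep-set as the original
print(make_cleaner(" '")(s))  # also keep spaces and apostrophes
```

Passing `" '"` as `extra_keep` keeps spaces and apostrophes too, which gives the result the question asked for, while the default reproduces the output shown above.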