Using strip() in Python
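Write a function list_of_words that takes a list of strings like the one described above and returns a list of the individual words, with all whitespace and punctuation removed (except apostrophes/single quotes).
My code removes the periods and the spaces, but not the commas or the exclamation marks.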
    def list_of_words(list_str):
        m = []
        for i in list_str:
            i.strip('.')
            i.strip(',')
            i.strip('!')
            m = m + i.split()
        return m

    print(list_of_words(["Four score and seven years ago, our fathers brought forth on",
                         "this continent a new nation, conceived in liberty and dedicated",
                         "to the proposition that all men are created equal. Now we are",
                         " engaged in a great civil war, testing whether that nation, or any",
                         "nation so conceived and so dedicated, can long endure!"]))
One of the simplest ways to remove specific punctuation marks and runs of multiple spaces is the re.sub function.
    import re

    sentence_list = ["Four score and seven years ago, our fathers brought forth on",
                     "this continent a new nation, conceived in liberty and dedicated",
                     "to the proposition that all men are created equal. Now we are",
                     " engaged in a great civil war, testing whether that nation, or any",
                     "nation so conceived and so dedicated, can long endure!"]
    # Remove commas, periods and exclamation marks, then trim each sentence
    sentences = [re.sub(r'[,.!]+', '', sentence).strip() for sentence in sentence_list]
    # Collapse runs of two or more spaces and join everything into one string
    words = ' '.join([re.sub(r' {2,}', ' ', sentence).strip() for sentence in sentences])
    print(words)
    # Four score and seven years ago our fathers brought forth on this continent a new nation conceived in liberty and dedicated to the proposition that all men are created equal Now we are engaged in a great civil war testing whether that nation or any nation so conceived and so dedicated can long endure
strip returns a new string; you need to capture that result and apply the remaining strip calls to it. So your code should be changed to
    for i in list_str:
        i = i.strip('.')
        i = i.strip(',')
        i = i.strip('!')
        ....
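On a second note, strip only removes the given characters from the beginning and end of the string. If you want to remove characters from the middle of the string, consider replace.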
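To illustrate the difference, here is a minimal sketch. The sample string is made up for this example and is not from the question. It shows that strip returns a new string and only touches the two ends, while replace removes every occurrence:

```python
# Hypothetical sample string, chosen only to show the behaviour.
s = "!hello, world!"

# strip returns a new string; s itself is unchanged because str is immutable.
stripped = s.strip("!")
print(stripped)  # hello, world
print(s)         # !hello, world!

# The comma sits in the middle, so strip(',') leaves it alone.
print(stripped.strip(","))  # hello, world

# replace removes the character wherever it occurs.
print(stripped.replace(",", ""))  # hello world
```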
You can also use a regular expression, as described in this question. In essence:
    import re
    i = re.sub('[.,!]', '', i)
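Put together, the regex approach might look like the sketch below. It assumes that only periods, commas and exclamation marks need to be removed, as in the question's sample text; the names words, line and cleaned are chosen for this example only.

```python
import re

def list_of_words(list_str):
    """Return the individual words of all strings, without '.', ',' or '!'."""
    words = []
    for line in list_str:
        # Remove every period, comma and exclamation mark, wherever it occurs.
        cleaned = re.sub(r"[.,!]", "", line)
        # split() without arguments splits on any run of whitespace,
        # so leading, trailing and repeated spaces are handled for free.
        words.extend(cleaned.split())
    return words

print(list_of_words(["years ago, our fathers", " can long endure!"]))
# ['years', 'ago', 'our', 'fathers', 'can', 'long', 'endure']
```

Apostrophes are left alone because they are not in the character class, so a word like don't stays intact.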
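As already mentioned, you need to assign the result of i.strip() back to i. And as also mentioned, the replace method is the better fit. Here is an example using replace: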
    def list_of_words(list_str: list) -> list:
        m = []
        for i in list_str:
            # replace returns a new string, so reassign it each time
            i = i.replace('.', '')
            i = i.replace(',', '')
            i = i.replace('!', '')
            m.extend(i.split())
        return m

    print(list_of_words(["Four score and seven years ago, our fathers brought forth on",
                         "this continent a new nation, conceived in liberty and dedicated",
                         "to the proposition that all men are created equal. Now we are",
                         " engaged in a great civil war, testing whether that nation, or any",
                         "nation so conceived and so dedicated, can long endure!"]))
As you can see, I also replaced m = m + i.split() with m.extend(i.split()) to make it easier to read.
It is better not to rely on your own list of punctuation but to use Python's, and, as others have pointed out, to remove the characters with a regex:
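    import re, string
    punctuations = re.sub("[`']", "", string.punctuation)
    # re.escape keeps characters such as \ and ] literal inside the character class
    i = re.sub("[" + re.escape(punctuations) + "]", "", i)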
There is also string.whitespace, although split already takes care of whitespace for you.
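As a rough sketch of how this could fit into the whole function: the version below builds the character class from string.punctuation once and reuses it. The names PUNCTUATION and PUNCT_RE are made up for this example, and wrapping the characters in re.escape is an added precaution so that \, ], ^ and - are treated literally inside the character class.

```python
import re
import string

# Every ASCII punctuation character except the backtick and the apostrophe,
# mirroring the snippet above.
PUNCTUATION = re.sub("[`']", "", string.punctuation)
# Compile once; re.escape keeps \ ] ^ - literal inside the character class.
PUNCT_RE = re.compile("[" + re.escape(PUNCTUATION) + "]")

def list_of_words(list_str):
    words = []
    for line in list_str:
        # Drop the punctuation, then let split() deal with all whitespace.
        words.extend(PUNCT_RE.sub("", line).split())
    return words

print(list_of_words(["It's a new nation; conceived in liberty!", "  (and dedicated)  "]))
# ["It's", 'a', 'new', 'nation', 'conceived', 'in', 'liberty', 'and', 'dedicated']
```

Compared with a hand-written list such as '.', ',' and '!', this also handles semicolons, parentheses and the other ASCII punctuation marks without further changes.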