How do I stop regex from matching unwanted empty strings?
I'm working on a problem set to count sentences. I decided to implement it by using regular expressions to split the string at the characters ".", "!" and "?". When I pass my text to re.split, it includes an empty string at the end of the list.
Source code:
from cs50 import get_string
import re


def main():
    text = get_string("Text: ")
    cole_liau(text)


# Implement 0.0588 * L - 0.296 * S - 15.8;
# L = avg num of letters / 100 words, S = avg num of sentences / 100 words
def cole_liau(intext):
    words = []
    letters = []
    sentences = re.split(r"[.!?]+", intext)
    print(sentences)
    print(len(sentences))


main()
Output:
Text: Congratulations. Today is your day. You're off to Great Places! You're off and away!
['Congratulations', ' Today is your day', " You're off to Great Places", " You're off and away", '']
5
I tried adding the + quantifier to make sure it matched at least one of [.!?], but that did not work either.
You may use a list comprehension to drop the empty strings:
def cole_liau(intext):
    words = []
    letters = []
    sentences = [sent for sent in re.split(r"[.!?]+", intext) if sent]
    print(sentences)
    print(len(sentences))
Which yields
['Congratulations', ' Today is your day', " You're off to Great Places", " You're off and away"]
4
As to why re.split() returns an empty string, see this answer.
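As a quick illustration of that behaviour, here is a small sketch using only the standard re module (the sample strings and variable names are made up for this example). Whenever the separator pattern matches at the very start or very end of the string, re.split still reports the empty text on the far side of that match:

```python
import re

PATTERN = r"[.!?]+"

# Separator at the very end: an empty string follows the last match.
trailing = re.split(PATTERN, "One. Two!")
print(trailing)  # ['One', ' Two', '']

# Separator at the very start: an empty string precedes the first match.
leading = re.split(PATTERN, "...One. Two")
print(leading)   # ['', 'One', ' Two']

# No separator at either edge: no empty strings appear.
inner = re.split(PATTERN, "One. Two")
print(inner)     # ['One', ' Two']
```

So the + quantifier only merges runs such as "?!" into one separator; it does not change what happens when a separator sits at the edge of the text.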
re.split is working fine here. You have a ! at the end of the last sentence, so it splits the text into what comes before that character (a sentence) and what comes after it (an empty string).
You can just add [:-1] at the end of your line to remove the last element of the list:
sentences = re.split(r"[.!?]+", intext)[:-1]
Output:
['Congratulations', ' Today is your day', " You're off to Great Places", " You're off and away"]
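One caveat worth keeping in mind: the slice always drops the last element, so it is only safe when the text is known to end with ., ! or ?. The sketch below compares the two approaches on input with and without final punctuation; split_with_slice and split_with_filter are hypothetical helper names, and the sample strings are invented for illustration:

```python
import re

PATTERN = r"[.!?]+"


def split_with_slice(text):
    # Unconditionally discards the last piece, empty or not.
    return re.split(PATTERN, text)[:-1]


def split_with_filter(text):
    # Keeps every piece that still has content after trimming whitespace.
    return [s.strip() for s in re.split(PATTERN, text) if s.strip()]


ends_with_punct = "Today is your day. You're off and away!"
no_final_punct = "Today is your day. You're off and away"

print(len(split_with_slice(ends_with_punct)))   # 2
print(len(split_with_slice(no_final_punct)))    # 1, the last sentence is lost
print(len(split_with_filter(ends_with_punct)))  # 2
print(len(split_with_filter(no_final_punct)))   # 2
```

Filtering on s.strip() also ignores a whitespace-only piece, which can appear when the text has a trailing space after the final punctuation mark.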