How to set a word limit for a paragraph's sentences in Python?
I need to set a limit when appending to a list.
sent = 'Python is dynamically-typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented and functional programming.'
I need at most 5 words per sentence, with each piece appended to a list.
The output should be:
sent_list = ['Python is dynamically-typed and garbage-collected.', 'It supports multiple programming paradigms,', 'including structured (particularly procedural), object-oriented', 'and functional programming.']
Try this:
words = sent.split(' ')
sent_list = [' '.join(words[n:n+5]) for n in range(0, len(words), 5)]
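The same slicing idea can be wrapped in a reusable function. This is only a sketch: the name chunk_words and the limit parameter are hypothetical, and it assumes you are happy to call split() with no argument, which collapses runs of whitespace rather than splitting on single spaces only.

```python
def chunk_words(text, limit=5):
    """Split text into pieces of at most `limit` whitespace-separated words."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # split() with no argument splits on any whitespace and drops empty strings
    words = text.split()
    # step through the word list `limit` words at a time
    return [' '.join(words[n:n + limit]) for n in range(0, len(words), limit)]


sent = ('Python is dynamically-typed and garbage-collected. It supports '
        'multiple programming paradigms, including structured (particularly '
        'procedural), object-oriented and functional programming.')
sent_list = chunk_words(sent)
print(sent_list)
```

The last piece simply holds whatever words are left over, so it may be shorter than the limit.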
This may be a bit unorthodox:
import re
sent_list = [re.sub(r'\s$', '', x.group('pattern')) for x in
             re.finditer(r'(?P<pattern>([^\s]+\s){5}|.+$)', sent)]
['Python is dynamically-typed and garbage-collected.',
'It supports multiple programming paradigms,',
'including structured (particularly procedural), object-oriented',
'and functional programming.']
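Explanation of '(?P<pattern>([^\s]+\s){5}|.+$)':
(?P<pattern>...): cosmetic; creates a named capture group.
([^\s]+\s){5}: finds a sequence of one or more non-whitespace characters followed by a single whitespace character, then repeats that 5 times.
|.+$: once the first alternative can no longer match, this simply matches whatever is left up to the end of the string.
We use re.finditer to loop over all match objects and x.group('pattern') to get each match. Every match except the last one ends with an extra whitespace character; one way to get rid of it is re.sub.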
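If the trailing whitespace is a nuisance, one possible variation is to keep it out of the match altogether, so no re.sub clean-up is needed. This is a sketch rather than the pattern from the answer above: chunk_words_regex and limit are hypothetical names, and the pattern matches up to limit - 1 "word plus whitespace" units followed by one final word.

```python
import re


def chunk_words_regex(text, limit=5):
    """Return pieces of at most `limit` words using a single regex."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # (?:\S+\s+){0,limit-1} : up to limit-1 words, each followed by whitespace
    # \S+                   : one last word, so a match never ends in whitespace
    pattern = re.compile(r'(?:\S+\s+){0,%d}\S+' % (limit - 1))
    # With no capturing groups, findall returns the full text of each match
    return pattern.findall(text)


sent = ('Python is dynamically-typed and garbage-collected. It supports '
        'multiple programming paradigms, including structured (particularly '
        'procedural), object-oriented and functional programming.')
print(chunk_words_regex(sent))
```

Note that this keeps the whitespace between words exactly as it appears in the input, so doubled spaces inside a piece are not normalized.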
sent = 'Python is dynamically-typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented and functional programming.'
# Expected result, used for the comparison at the end
sent_list = ['Python is dynamically-typed and garbage-collected.',
             'It supports multiple programming paradigms,',
             'including structured (particularly procedural), object-oriented',
             'and functional programming.']
new_list = []
inner_string = ""
sentence_list = sent.split(" ")
for idx, item in enumerate(sentence_list):
    if (idx+1)==1 or (idx+1)%5 != 0:
        if (idx+1) == len(sentence_list):
            # Last word overall: close the final, possibly shorter, piece
            inner_string += item
            new_list.append(inner_string)
        else:
            inner_string += item + " "
    elif (idx+1)!=1 and (idx+1) % 5 == 0 :
        # Every fifth word: close the current piece and start a new one
        inner_string += item
        new_list.append(inner_string)
        inner_string = ""
print(new_list)
print(new_list == sent_list)
Output:
['Python is dynamically-typed and garbage-collected.', 'It supports multiple programming paradigms,', 'including structured (particularly procedural), object-oriented', 'and functional programming.']
True
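The index arithmetic in the loop can also be avoided by collecting words in a small buffer and flushing it whenever it reaches the limit. This is a minimal sketch of that idea, not a rewrite of the answer's code; chunk_with_buffer and limit are hypothetical names, and it splits on single spaces just like the loop above.

```python
def chunk_with_buffer(text, limit=5):
    """Group the space-separated words of `text` into pieces of `limit` words."""
    chunks = []
    buffer = []
    for word in text.split(' '):
        buffer.append(word)
        if len(buffer) == limit:
            # Buffer is full: emit it as one piece and start over
            chunks.append(' '.join(buffer))
            buffer = []
    if buffer:
        # Leftover words form the final, shorter piece
        chunks.append(' '.join(buffer))
    return chunks


print(chunk_with_buffer('one two three four five six seven'))
```

Handling the leftover words once after the loop removes the need to check for the last index on every iteration.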