
How do I set a word limit for the sentences of a paragraph in Python?

I need to enforce a limit when appending to a list.

sent = 'Python is dynamically-typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented and functional programming.'

I just need to put 5 words in each piece and append each piece to a list.

The output should be:

sent_list = ['Python is dynamically-typed and garbage-collected.', 'It supports multiple programming paradigms,', 'including structured (particularly procedural), object-oriented', 'and functional programming.']

Try this:

words = sent.split(' ')
sent_list = [' '.join(words[n:n+5]) for n in range(0, len(words), 5)]
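
The same slicing idea can be wrapped in a small helper. This is only a minimal sketch: the name chunk_words and its size parameter are made up for illustration, and it calls split() with no argument, which assumes that any run of whitespace (several spaces, newlines) should count as a single separator.

```python
def chunk_words(text, size=5):
    """Split text into pieces of at most `size` whitespace-separated words."""
    if size < 1:
        raise ValueError("size must be at least 1")
    words = text.split()  # no argument: any run of whitespace is one separator
    # Step through the word list `size` words at a time and re-join each slice.
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


sent = ('Python is dynamically-typed and garbage-collected. It supports '
        'multiple programming paradigms, including structured (particularly '
        'procedural), object-oriented and functional programming.')
sent_list = chunk_words(sent)
print(sent_list)
```

The last piece simply holds whatever words are left over, so it may be shorter than the limit.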

This may be a bit unorthodox:

import re

sent_list = [re.sub(r'\s$', '', x.group('pattern')) for x in
     re.finditer(r'(?P<pattern>([^\s]+\s){5}|.+$)', sent)]

['Python is dynamically-typed and garbage-collected.',
 'It supports multiple programming paradigms,',
 'including structured (particularly procedural), object-oriented',
 'and functional programming.']

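Explanation of '(?P<pattern>([^\s]+\s){5}|.+$)':

  • (?P<pattern>... ) : cosmetic; creates a named capture group.
  • ([^\s]+\s){5} : matches a sequence of one or more non-whitespace characters followed by a single whitespace character, repeated 5 times.
  • |.+$ : once the first alternative can no longer match, simply takes whatever remains up to the end of the string.

We use re.finditer to loop over all match objects and x.group('pattern') to get each match. Every match except the last one ends with an extra whitespace character; one way to get rid of it is re.sub.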
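
For comparison, here is a sketch of a variant that avoids the trailing whitespace in the first place, so no re.sub cleanup is needed. This is a different pattern from the one explained above, not the answerer's, and the variable names are illustrative. It anchors each match on a word and then greedily adds up to four more words, each preceded by whitespace.

```python
import re

sent = ('Python is dynamically-typed and garbage-collected. It supports '
        'multiple programming paradigms, including structured (particularly '
        'procedural), object-oriented and functional programming.')

# \S+ matches one word; (?:\s+\S+){0,4} greedily appends up to four more words.
# The group is non-capturing, so findall returns each whole match as a string.
pattern = re.compile(r'\S+(?:\s+\S+){0,4}')
regex_list = pattern.findall(sent)
print(regex_list)
```

Because every match starts and ends on a non-whitespace character, the separators between pieces are skipped rather than captured. Whitespace inside a piece is kept exactly as it appears in the input.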

sent = 'Python is dynamically-typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented and functional programming.'
sent_list = ['Python is dynamically-typed and garbage-collected.', 
            'It supports multiple programming paradigms,', 
            'including structured (particularly procedural), object-oriented', 
            'and functional programming.']

new_list = []
inner_string = ""
sentence_list = sent.split(" ")
for idx, item in enumerate(sentence_list):
    if (idx+1)==1 or (idx+1)%5 != 0:
        if (idx+1) == len(sentence_list):
            inner_string += item
            new_list.append(inner_string)
        else:
            inner_string += item + " "
    elif (idx+1)!=1 and (idx+1) % 5 == 0 :
        inner_string += item
        new_list.append(inner_string)
        inner_string = ""
        
print(new_list)
print(new_list == sent_list)

Output:

['Python is dynamically-typed and garbage-collected.', 'It supports multiple programming paradigms,', 'including structured (particularly procedural), object-oriented', 'and functional programming.']
True
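
The index arithmetic in the loop above can also be replaced by a word buffer that is flushed whenever it reaches the limit. The following is a sketch under that assumption; group_words, buffer and size are illustrative names, not part of the original answer.

```python
def group_words(text, size=5):
    """Collect words into a buffer and flush it every `size` words."""
    chunks = []
    buffer = []
    for word in text.split(" "):
        buffer.append(word)
        if len(buffer) == size:
            chunks.append(" ".join(buffer))
            buffer = []
    if buffer:
        # Flush the leftover words that did not fill a whole group.
        chunks.append(" ".join(buffer))
    return chunks


sent = ('Python is dynamically-typed and garbage-collected. It supports '
        'multiple programming paradigms, including structured (particularly '
        'procedural), object-oriented and functional programming.')
new_list = group_words(sent)
print(new_list)
```

The final flush after the loop takes the place of the separate check for the last index.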
