
How can I split a string on every kth occurrence of a space, but with overlap?

I have a string:

Your dog is running up the tree.

I would like to split it on every kth space, but with overlap. For example, on every second space:

Your dog is running up the tree.
out = ['Your dog', 'dog is', 'is running', 'running up', 'up the', 'the tree']

On every third space:

Your dog is running up the tree.
out = ['Your dog is', 'dog is running', 'is running up', 'running up the', 'up the tree']

I know I can do something like this:

>>> i = iter(s.split('-'))
>>> map("-".join, zip(i, i))

But this does not give the overlap I want. Any ideas?

I would suggest splitting at every space first, then joining the required number of words back together while iterating over the list:

s = 'Your dog is running up the tree.'
lst = s.split()

def k_with_overlap(lst, k):
    return [' '.join(lst[i:i+k]) for i in range(len(lst) - k + 1)]

k_with_overlap(lst, 2)

['Your dog', 'dog is', 'is running', 'running up', 'up the', 'the tree.']
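
The same sliding window can also be written with zip, which stays close to the zip(i, i) idiom from the question. The following is only a minimal sketch; the helper name k_with_overlap_zip is hypothetical and not part of the original answer. Zipping k copies of the word list, each shifted by one more position, yields every run of k consecutive words.

```python
def k_with_overlap_zip(words, k):
    # Hypothetical helper: build k views of the list, each starting one
    # word later, then zip them so each tuple holds k consecutive words.
    # zip stops at the shortest view, so no partial windows are produced.
    shifted = [words[i:] for i in range(k)]
    return [' '.join(group) for group in zip(*shifted)]


words = 'Your dog is running up the tree.'.split()
print(k_with_overlap_zip(words, 3))
# ['Your dog is', 'dog is running', 'is running up', 'running up the', 'up the tree.']
```

If k is larger than the number of words, the last shifted view is empty and the result is an empty list.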

I think this might be what you need:

>>> s = 'Your dog is running up the tree.'
>>> n = 2
>>> words = s.split()
>>> [' '.join(words[i:i+n]) for i in range(len(words) - n + 1)]
['Your dog', 'dog is', 'is running', 'running up', 'up the', 'the tree.']

I tried the following approach, and the result seems to be exactly what you expect.

def split(sentence, space_num):
    sent_array = sentence.split(' ')
    length = len(sent_array)
    output = []
    for i in range(length+1-space_num):
        list_comp = sent_array[i:i+space_num]
        output.append(' '.join(list_comp))
    return output

print(split('the quick brown fox jumped over the lazy dog', 5))

The output is as follows (try changing space_num to suit your requirements).

['the quick brown fox jumped', 'quick brown fox jumped over', 'brown fox jumped over the', 'fox jumped over the lazy', 'jumped over the lazy dog']
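
One detail worth checking in the function above is sentence.split(' '), which treats every single space as a separator, so consecutive spaces produce empty strings that would end up inside the windows. A small sketch of the difference, using a made-up input with a double space (an assumption for illustration, not taken from the original post):

```python
text = 'the quick  brown fox'  # note the double space (illustrative input)

# split(' ') keeps an empty string between the two adjacent spaces.
print(text.split(' '))  # ['the', 'quick', '', 'brown', 'fox']

# split() with no argument splits on runs of whitespace and drops empties.
print(text.split())     # ['the', 'quick', 'brown', 'fox']
```

For input that may contain irregular spacing, split() without an argument is probably the safer choice.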

