How can I split a string on every kth occurrence of a space, but with overlap?
I have a string:
Your dog is running up the tree.
I'd like to be able to split it on every kth space, but with overlap. For example, on every second space:
Your dog is running up the tree.
out = ['Your dog', 'dog is', 'is running', 'running up', 'up the', 'the tree']
On every third space:
Your dog is running up the tree.
out = ['Your dog is', 'dog is running', 'is running up', 'running up the', 'up the tree']
I know I can do something like this:
>>> i=iter(s.split('-'))
>>> map("-".join,zip(i,i))
But that doesn't give me the overlap I want. Any ideas?
I'd suggest splitting on every space first, then joining the required number of words back together as you iterate over the list:
s = 'Your dog is running up the tree.'
lst = s.split()
def k_with_overlap(lst, k):
    return [' '.join(lst[i:i+k]) for i in range(len(lst) - k + 1)]
k_with_overlap(lst, 2)
['Your dog', 'dog is', 'is running', 'running up', 'up the', 'the tree.']
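The same split-then-rejoin idea can also be written with zip, which is close to what the question started from. This is only a sketch, not part of the answer above: the name windows_via_zip is made up for illustration, and it assumes words are separated by whitespace.

```python
def windows_via_zip(text, k):
    """Join every run of k consecutive words; neighbouring runs share k - 1 words."""
    words = text.split()
    # words[0:], words[1:], ..., words[k-1:] are the same list shifted by one word each.
    # zip stops at the shortest slice, so it yields len(words) - k + 1 groups.
    staggered = [words[offset:] for offset in range(k)]
    return [' '.join(group) for group in zip(*staggered)]

print(windows_via_zip('Your dog is running up the tree.', 3))
# ['Your dog is', 'dog is running', 'is running up', 'running up the', 'up the tree.']
```

Unlike zip(i, i), which pulls both items of each pair from one shared iterator and therefore never revisits a word, each slice here starts one word later, which is what produces the overlap.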
I think this may be what you need:
>>> s = 'Your dog is running up the tree.'
>>> n = 2
>>> [' '.join(s.split()[i:i+n]) for i in range(0,len(s.split()), n)]
['Your dog', 'is running', 'up the', 'tree.']
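The snippet above moves forward n words at a time, so its chunks do not overlap. As a rough sketch of how the two behaviours relate, the step can be made a parameter. The names word_windows and step are hypothetical, and unlike the snippet above this version drops a trailing chunk that has fewer than k words.

```python
def word_windows(text, k, step=1):
    """Return k-word chunks of text, moving `step` words between chunk starts."""
    words = text.split()
    # Stopping at len(words) - k + 1 drops any trailing chunk shorter than k words.
    return [' '.join(words[i:i + k]) for i in range(0, len(words) - k + 1, step)]

s = 'Your dog is running up the tree.'
print(word_windows(s, 2))          # step=1: neighbouring chunks share one word
print(word_windows(s, 2, step=2))  # step=2: no overlap, leftover 'tree.' is dropped
```

With step=1 this gives the overlapping result the question asks for; with step equal to k it gives disjoint chunks.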
I tried the following, and the result seems to be exactly what you expect.
def split(sentence, space_num):
    sent_array = sentence.split(' ')
    length = len(sent_array)
    output = []
    for i in range(length+1-space_num):
        list_comp = sent_array[i:i+space_num]
        output.append(' '.join(list_comp))
    return output
print(split('the quick brown fox jumped over the lazy dog', 5))
The output is as follows (try changing space_num to suit your needs).
['the quick brown fox jumped', 'quick brown fox jumped over', 'brown fox jumped over the', 'fox jumped over the lazy', 'jumped over the lazy dog']
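One detail worth checking when choosing between split(' ') and split(): they behave differently when the input contains repeated spaces. A small illustration, using a made-up input string:

```python
text = 'the  quick brown'  # note the two spaces after 'the'

# split(' ') splits on every single space, so the doubled space yields an empty string.
print(text.split(' '))  # ['the', '', 'quick', 'brown']

# split() with no argument treats any run of whitespace as one separator.
print(text.split())     # ['the', 'quick', 'brown']
```

With split(' '), that empty string would be counted as a word when the groups are joined, so split() is likely the safer choice if the input may have irregular spacing.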