Pythonic: Find all consecutive sub-sequences of a certain length
I have a list of integers and I want to find all consecutive sub-sequences of length n in this list. For example:
>>> int_list = [1,4,6,7,8,9]
>>> conseq_sequences(int_list, length=3)
[[6,7,8], [7,8,9]]
The best I could come up with is:
def conseq_sequences(li, length):
    return [li[n:n+length]
            for n in range(len(li) - length + 1)
            if li[n:n+length] == list(range(li[n], li[n] + length))]
This isn't overly readable. Is there any readable pythonic way of doing this?
Here's a more general solution that works for arbitrary input iterables (not just sequences):
from itertools import groupby, islice, tee
from operator import itemgetter

def consecutive_subseq(iterable, length):
    for _, consec_run in groupby(enumerate(iterable), lambda x: x[0] - x[1]):
        k_wise = tee(map(itemgetter(1), consec_run), length)
        for n, it in enumerate(k_wise):
            next(islice(it, n, n), None)  # consume n items from it
        yield from zip(*k_wise)
itertools.groupby finds runs of consecutive integers such as 6, 7, 8, 9 in the input. It is based on the example from the docs that shows how to find runs of consecutive numbers:
    The key to the solution is differencing with a range generated by enumerate() so that consecutive integers all appear in same group (run).
itertools.tee + zip allow iterating over each run k-wise, a generalization of the pairwise recipe from the itertools docs.
next(islice(iterator, n, n), None) is from the consume recipe there.
Example:
print(*consecutive_subseq([1,4,6,7,8,9], 3))
# -> (6, 7, 8) (7, 8, 9)
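The grouping step can also be looked at on its own. The following is a minimal sketch, not part of the original answer; the helper name consecutive_runs is made up for illustration. It shows only how differencing against enumerate() splits the input into runs of consecutive integers:

```python
from itertools import groupby

def consecutive_runs(iterable):
    """Yield each maximal run of consecutive integers as a list.

    Hypothetical helper, used only to illustrate the groupby step.
    """
    # index - value stays constant inside a run of consecutive integers,
    # so groupby places every run in its own group.
    for _, group in groupby(enumerate(iterable), key=lambda pair: pair[0] - pair[1]):
        yield [value for _, value in group]

print(list(consecutive_runs([1, 4, 6, 7, 8, 9])))
# -> [[1], [4], [6, 7, 8, 9]]
```

Only the last run is long enough to contain sub-sequences of length 3, which is why the full solution yields just (6, 7, 8) and (7, 8, 9).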
The code uses Python 3 syntax that could be adapted for Python 2 if needed.
See also: What is the most pythonic way to sort dates sequences?
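The k-wise iteration used in the answer above can be sketched separately as well. The function name k_wise is an assumption made for this illustration (the answer only uses it as a local variable name); the technique is the tee + zip generalization of the pairwise recipe, combined with the consume idiom:

```python
from itertools import islice, tee

def k_wise(iterable, k):
    """Return an iterator over overlapping k-tuples of the input.

    Hypothetical helper: (s0, ..., s[k-1]), (s1, ..., s[k]), ...
    """
    iterators = tee(iterable, k)
    for n, it in enumerate(iterators):
        # Advance the n-th copy by n items (the consume idiom).
        next(islice(it, n, n), None)
    # zip stops as soon as the most advanced copy is exhausted.
    return zip(*iterators)

print(list(k_wise([6, 7, 8, 9], 3)))
# -> [(6, 7, 8), (7, 8, 9)]
```

If the input has fewer than k items, the most advanced copy is empty from the start and nothing is produced, which is how short runs drop out of the full solution.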
One solution could be as follows:
import numpy  # numpy.diff is used here; without numpy, a small helper function works too

def conseq_sequences(li, length):
    return [li[i:i+length]
            for i in range(len(li) - length + 1)
            if sum(numpy.diff(li[i:i+length])) == length - 1]
Basically, I first take the consecutive sub-lists of the given length from the list, and then check whether the sum of the differences of their elements is equal to length - 1.
Please note that if the elements are consecutive, their differences add up to length - 1, e.g. for the sub-list [5,6,7] the differences of its elements are [1, 1] and their sum is 2. The sum of the differences is simply the last element minus the first, so this check is only reliable when the list contains strictly increasing integers.
But to be honest, I'm not sure if this solution is clearer or more pythonic than yours.
Just in case you don't have numpy, the diff function can be easily defined as follows:
def diff(l):
    '''For example, when l=[1,2,3] the return value is [1,1]'''
    return [x - l[i - 1] for i, x in enumerate(l)][1:]
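The sum of the differences equals the last element minus the first, so a window such as [6, 8, 8] also sums to 2 without being consecutive. As a hedged variant, not the answerer's code, the sketch below checks every difference individually. The function name conseq_sequences_strict is made up for this illustration, and diff is rewritten with zip only to keep the block self-contained:

```python
def diff(values):
    """Differences between neighbouring elements, e.g. [1, 2, 4] -> [1, 2]."""
    return [b - a for a, b in zip(values, values[1:])]

def conseq_sequences_strict(li, length):
    """Windows of `length` items whose neighbouring differences are all 1.

    Hypothetical variant of the sum-based check above.
    """
    return [li[i:i + length]
            for i in range(len(li) - length + 1)
            if all(d == 1 for d in diff(li[i:i + length]))]

print(conseq_sequences_strict([1, 4, 6, 7, 8, 9], 3))
# -> [[6, 7, 8], [7, 8, 9]]
print(sum(diff([6, 8, 8])))
# -> 2, although 6, 8, 8 is not consecutive
print(conseq_sequences_strict([6, 8, 8], 3))
# -> []
```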
Using operator.itemgetter and itertools.groupby:
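from itertools import groupby
from operator import itemgetter

def conseq_sequences(li, length):
    # all windows of `length` items, built from shifted slices of li
    res = zip(*(li[i:] for i in range(length)))
    final = []
    for x in res:
        # index - value is constant within a run of consecutive integers
        for k, g in groupby(enumerate(x), lambda p: p[0] - p[1]):
            get_map = list(map(itemgetter(1), g))
            if len(get_map) == length:
                final.append(get_map)
    return final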
Without imports:
def conseq_sequences(li, length):
    res = zip(*(li[i:] for i in range(length)))
    final = []
    for ele in res:
        # every element must be exactly one more than its predecessor
        if all(x == y+1 for x, y in zip(ele[1:], ele)):
            final.append(ele)
    return final
Which can be turned into a list comprehension:
def conseq_sequences(li, length):
    res = zip(*(li[i:] for i in range(length)))
    return [ele for ele in res if all(x == y+1 for x, y in zip(ele[1:], ele))]
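To make the shifted-slices idiom above easier to follow, here is a small step-by-step sketch. The variable names shifted, windows and consecutive are chosen for this illustration only:

```python
li = [1, 4, 6, 7, 8, 9]
length = 3

# li[0:], li[1:], li[2:] are the same list shifted by 0, 1 and 2 positions.
shifted = [li[i:] for i in range(length)]

# zip stops at the shortest slice, so only full-length windows are produced.
windows = list(zip(*shifted))
print(windows)
# -> [(1, 4, 6), (4, 6, 7), (6, 7, 8), (7, 8, 9)]

# Keep a window when each element is exactly one more than its predecessor.
consecutive = [w for w in windows if all(x == y + 1 for x, y in zip(w[1:], w))]
print(consecutive)
# -> [(6, 7, 8), (7, 8, 9)]
```

Note that this approach returns tuples rather than lists, and it needs a sequence that supports slicing, not an arbitrary iterable.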
def condition(tup):
    # True when the three items form a run such as (6, 7, 8)
    return tup[0] + 1 == tup[1] and tup[1] + 1 == tup[2]

def conseq_sequence(li):
    # hard-coded for sub-sequences of length 3
    return [x for x in zip(li, li[1:], li[2:]) if condition(x)]