Pythonic way for longest contiguous subsequence

Question

I have a sorted list of integers called "blacks", and I'm looking for an elegant way to get the start "s" and end "e" of the longest contiguous subsequence. (The original problem had black pixels in a w×h bitmap, and I am looking for the longest line in a given column x.) My solution works but looks ugly:

# blacks is a list of integers generated from a bitmap this way:
# blacks= [y for y in range(h) if bits[y*w+x]==1]

longest=(0,0)
s=blacks[0]
e=s-1
for i in blacks:
    if e+1 == i:   # Contiguous?
        e=i
    else:
        if e-s > longest[1]-longest[0]:
            longest = (s,e)
        s=e=i
if e-s > longest[1]-longest[0]:
    longest = (s,e)
print(longest)

I feel that this could be done in a smart one- or two-liner.

Answer 1

You could do the following, using itertools.groupby and itertools.chain:

from itertools import groupby, chain
l = [1, 2, 5, 6, 7, 8, 10, 11, 12]
f = lambda x: x[1] - x[0] == 1  # key function to identify proper neighbours

The following is still almost readable ;-) and gets you a decent intermediate result; proceeding from there in a more sensible manner would probably be a valid option:

max((list(g) for k, g in groupby(zip(l, l[1:]), key=f) if k), key=len)
# [(5, 6), (6, 7), (7, 8)]
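
To see what that expression does, it can help to unroll it into named steps. The following is only an illustrative sketch; the names pairs, groups, runs and longest_pairs are made up here and do not appear in the original answer:

```python
from itertools import groupby

l = [1, 2, 5, 6, 7, 8, 10, 11, 12]

# Pair every element with its successor.
pairs = list(zip(l, l[1:]))

# Group consecutive pairs by whether they are direct neighbours
# (difference of exactly 1); k is True or False for each group.
groups = [(k, list(g)) for k, g in groupby(pairs, key=lambda p: p[1] - p[0] == 1)]

# Keep only the groups made of true neighbours.
runs = [g for k, g in groups if k]

# The longest such group of pairs spans the longest contiguous run.
longest_pairs = max(runs, key=len)
print(longest_pairs)  # [(5, 6), (6, 7), (7, 8)]
```

Note that max raises a ValueError when the list contains no neighbouring pair at all, so a real input would probably need a guard for that case.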

In order to extract the actual desired sequence [5, 6, 7, 8] in one line, you have to use some more kung-fu:

sorted(set(chain(*max((list(g) for k, g in groupby(zip(l, l[1:]), key=f) if k), key=len))))
# [5, 6, 7, 8]

I shall leave it to you to work out the internals of this monstrosity :-) but keep in mind: a one-liner is often satisfying in the short run, but in the long term you are better off opting for readability and code that you and your co-workers will understand. And readability is a big part of the Pythonicity you allude to.
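
For comparison, a readable multi-line version might look roughly like this. It is a plain loop rather than the groupby approach, and the function name longest_run is hypothetical; it assumes the input is sorted ascending and free of duplicates, as in the question:

```python
def longest_run(values):
    """Return (start, end) of the longest run of consecutive integers.

    Assumes `values` is sorted ascending without duplicates.
    Returns None for an empty input; on ties the first run wins.
    """
    if not values:
        return None
    best_start = best_end = start = end = values[0]
    for v in values[1:]:
        if v == end + 1:      # contiguous: extend the current run
            end = v
        else:                 # gap: start a new run at v
            start = end = v
        if end - start > best_end - best_start:
            best_start, best_end = start, end
    return best_start, best_end


print(longest_run([1, 2, 5, 6, 7, 8, 10, 11, 12]))  # (5, 8)
```

It returns the (s, e) pair the question asks for directly, and it also copes with inputs in which every run has length one.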

Also note that this is O(N log N) because of the sorting. You can achieve the same result in O(N) by applying one of the O(N) duplicate-removal techniques, e.g. one involving an OrderedDict, to the output of chain, but that one line would get even longer.
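
One way the OrderedDict variant could be spelled, split over two statements for legibility, is sketched below. The names best and result are illustrative only; OrderedDict.fromkeys keeps the first occurrence of each key in insertion order, so no sort is needed:

```python
from collections import OrderedDict
from itertools import chain, groupby

l = [1, 2, 5, 6, 7, 8, 10, 11, 12]
f = lambda x: x[1] - x[0] == 1  # same key function as above

# Longest group of neighbouring pairs, as before.
best = max((list(g) for k, g in groupby(zip(l, l[1:]), key=f) if k), key=len)

# Flatten the pairs and drop duplicates while preserving order.
result = list(OrderedDict.fromkeys(chain.from_iterable(best)))
print(result)  # [5, 6, 7, 8]
```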

Update:

One of the O(N) ways to do it is DanD.'s suggestion, which can be squeezed into a single line by using the comprehension trick to avoid assigning an intermediate result to a variable:

list(range(*[(x[0][0], x[-1][1]+1) for x in [max((list(g) for k, g in groupby(zip(l, l[1:]), key=f) if k), key=len)]][0]))
# [5, 6, 7, 8]
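
If the single-line constraint is dropped, the same idea reads more naturally with an intermediate variable. This is a sketch with made-up names (best, start, end, run): the first element of the first pair and the second element of the last pair are exactly the "s" and "e" from the question:

```python
from itertools import groupby

l = [1, 2, 5, 6, 7, 8, 10, 11, 12]
f = lambda x: x[1] - x[0] == 1  # same key function as above

best = max((list(g) for k, g in groupby(zip(l, l[1:]), key=f) if k), key=len)

start = best[0][0]   # first element of the first pair
end = best[-1][1]    # second element of the last pair
run = list(range(start, end + 1))
print(start, end, run)  # 5 8 [5, 6, 7, 8]
```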

Prettier, however, it is not :D

Answer 1
1 accepted 2017-10-29 21:08:14