How to count the longest sequence of the same value in a list of lists, and then output the largest sequence in a tuple

I have a list of lists of lists (in a text file) with values similar to what is below:

L = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

This is the function I'm using:

def longest_sequence(l):
        counter = 0
        sl = []
        sublists = []
        for i in l:
            if (l[counter+1]==l[counter]):
                sl.append(l[counter])
                counter = counter + 1
            else:
                counter = 0
                sublists.append([sl[i], len(sl)])
        return sublists

Right now it counts only one value, in this case the one that appears first (1). It then goes to the next line, which is a similar sequence, and the output I'm getting is this:

returns the sub lists [[1, 111], [1, 222], [1, 333], [1, 444], [1, 555], [1, 666], [1, 777], [1, 888]] 

Basically, what I'm trying to do is check the list and determine which sub-list has the longest length, so I should be getting something like this instead:

sl = [(1, 111), (0, 395), (1, 65), (2, 358), (1, 71)]

The second tuple should be the one returned, as it contains the value that was repeated continuously 395 times (the longest length) among all sub-lists.

You can do this simply with itertools.groupby():

In []:
import itertools as it

[(k, sum(1 for _ in g)) for k, g in it.groupby(L)]
# [(k, len(list(g))) for k, g in it.groupby(L)]  # alternative

Out[]:
[(1, 112), (0, 394), (1, 65), (2, 359), (1, 71)]

To get the maximum, you can use max() with a key, e.g.:

In []:
import operator as op

counts = [(k, sum(1 for _ in g)) for k, g in it.groupby(L)]
max(counts, key=op.itemgetter(1))

Out[]:
(0, 394)
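
Since the question mentions a list of lists, the same idea can be applied to each sub-list in turn. The following is a minimal sketch, assuming the file has already been parsed into a nested list; the names `runs`, `longest_run` and the sample `data` are hypothetical and not from the original post. Runs are counted within each sub-list and do not span sub-list boundaries.

```python
import itertools as it
import operator as op


def runs(seq):
    """Return (value, run_length) pairs for consecutive equal values in seq."""
    return [(k, sum(1 for _ in g)) for k, g in it.groupby(seq)]


def longest_run(list_of_lists):
    """Return the longest (value, run_length) pair found in any sub-list."""
    all_runs = [run for sub in list_of_lists for run in runs(sub)]
    # max() raises ValueError on an empty sequence, so supply a default.
    return max(all_runs, key=op.itemgetter(1), default=None)


# Hypothetical sample data standing in for the contents of the text file.
data = [[1, 1, 0, 0, 0], [2, 2, 2, 2, 1], [1]]
print(longest_run(data))  # (2, 4)
```

On ties, max() returns the first run with the maximal length.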

However, to fix your code:

  • You are confusing your indexing (counter): when you reset it in the else: block, you start from the beginning again. Just use range(1, len(l)) in your for loop for the index.
  • You don't reset sl in the else: block (hence it keeps growing by 111), but you really don't need to create the sl list; just count the items.
  • You miss the case of the last value.
  • Dealing with the last value needs a little reordering of the logic.

Fixed, it would look like:

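def longest_sequence(l):
    sublists = []
    if not l:
        return sublists  # nothing to count in an empty list
    counter = 1
    for i in range(1, len(l)):
        if l[i] != l[i-1]:
            # run ended: record (value, run length) and start a new run
            sublists.append((l[i-1], counter))
            counter = 0
        counter += 1

    # record the final run, which the loop never closes
    sublists.append((l[-1], counter))

    return sublists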

In []:
longest_sequence(L)

Out[]:
[(1, 112), (0, 394), (1, 65), (2, 359), (1, 71)]

In []:
max(longest_sequence(L), key=op.itemgetter(1))

Out[]:
(0, 394)
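
If only the longest run is needed, the intermediate list can be skipped. This is a sketch under that assumption; `longest_run_only` is a hypothetical name, not part of the original answer. On ties it keeps the earliest run, which matches what max() does.

```python
def longest_run_only(l):
    """Return (value, length) of the longest run of equal values, or None if l is empty."""
    if not l:
        return None
    best = (l[0], 0)
    counter = 1
    for i in range(1, len(l)):
        if l[i] == l[i-1]:
            counter += 1
        else:
            # run ended: keep it only if it is strictly longer than the best so far
            if counter > best[1]:
                best = (l[i-1], counter)
            counter = 1
    # the final run is never closed inside the loop, so check it here
    if counter > best[1]:
        best = (l[-1], counter)
    return best


print(longest_run_only([1, 1, 0, 0, 0, 2]))  # (0, 3)
```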

You can use a run-length encoding algorithm. Example tool from the more_itertools library:

Code

import more_itertools as mit


list(mit.run_length.encode(L))
# [(1, 112), (0, 394), (1, 65), (2, 359), (1, 71)]

Details

The .encode method returns the equivalent of the following generator expression:

((k, ilen(g)) for k, g in groupby(iterable))

You can optionally use the .decode method to get back the original list.
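
To illustrate the round trip without installing anything, here is a standard-library sketch of what encoding and decoding amount to. The function names `rle_encode` and `rle_decode` are hypothetical stand-ins, not the more_itertools API itself.

```python
from itertools import chain, groupby, repeat


def rle_encode(iterable):
    """Yield (value, run_length) pairs for consecutive equal values."""
    return ((k, sum(1 for _ in g)) for k, g in groupby(iterable))


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return chain.from_iterable(repeat(k, n) for k, n in pairs)


sample = [1, 1, 1, 0, 0, 2]
encoded = list(rle_encode(sample))
print(encoded)                    # [(1, 3), (0, 2), (2, 1)]
print(list(rle_decode(encoded)))  # [1, 1, 1, 0, 0, 2]
```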

Install via: pip install more_itertools
