Python finding repeating sequence in list of integers?

I have a list of lists, and each list has a repeating sequence. I'm trying to count the length of the repeated sequence of integers in each list:

list_a = [111,0,3,1,111,0,3,1,111,0,3,1] 

list_b = [67,4,67,4,67,4,67,4,2,9,0]

list_c = [1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,23,18,10]

Which would return:

list_a count = 4 (for [111,0,3,1])

list_b count = 2 (for [67,4])

list_c count = 10 (for [1,2,3,4,5,6,7,8,9,0])

Any advice or tips would be welcome. I'm trying to work it out with re.compile right now, but it's not quite right.
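Since the question mentions re.compile, here is a minimal sketch of one way a regex could be applied, assuming the integers are first joined into a comma-terminated string. The helper name regex_seq_len and the pattern are illustrative assumptions, not something from the original post. A lazy group followed by a backreference finds the shortest leading block that is immediately repeated.

```python
import re

# Lazy group of comma-terminated integers, followed by at least one exact repeat.
# (Hypothetical pattern for illustration only.)
LEADING_REPEAT = re.compile(r'((?:-?\d+,)+?)\1+')

def regex_seq_len(seq):
    """Length of the shortest leading block that repeats immediately, or 1 if none."""
    # Terminate every item with a comma so matches always align on item boundaries.
    text = ''.join(f'{n},' for n in seq)
    match = LEADING_REPEAT.match(text)
    if match is None:
        return 1
    # Each item in the captured block contributes exactly one comma.
    return match.group(1).count(',')

list_a = [111, 0, 3, 1, 111, 0, 3, 1, 111, 0, 3, 1]
list_b = [67, 4, 67, 4, 67, 4, 67, 4, 2, 9, 0]

print(regex_seq_len(list_a))  # 4
print(regex_seq_len(list_b))  # 2
```

Comparing list slices directly, as the answers do, avoids the string conversion and is usually simpler to reason about.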

Guess the sequence length by iterating through guesses between 2 and half the sequence length. If no pattern is discovered, return 1 by default.

def guess_seq_len(seq):
    guess = 1
    # Integer division: range() needs an int in Python 3
    max_len = len(seq) // 2
    # Include max_len itself so a block filling exactly half the list is tested
    for x in range(2, max_len + 1):
        if seq[0:x] == seq[x:2*x]:
            return x

    return guess

list_a = [111,0,3,1,111,0,3,1,111,0,3,1]
list_b = [67,4,67,4,67,4,67,4,2,9,0]
list_c = [1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,23,18,10]

print(guess_seq_len(list_a))
print(guess_seq_len(list_b))
print(guess_seq_len(list_c))
print(guess_seq_len(list(range(500))))   # test of no repetition

This gives (as expected):

4
2
10
1

As requested, this alternative gives the longest repeated sequence, so it returns 4 for list_b. The only change is guess = x instead of return x.

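def guess_seq_len(seq):
    guess = 1
    # Integer division: range() needs an int in Python 3
    max_len = len(seq) // 2
    # Include max_len itself so a block filling exactly half the list is tested
    for x in range(2, max_len + 1):
        if seq[0:x] == seq[x:2*x]:
            guess = x   # keep going; a longer match overwrites a shorter one

    return guess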
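To make the difference between the two versions concrete, here is a minimal sketch that collects every matching length instead of returning only one. The name repeat_lengths is a hypothetical helper introduced for illustration. The shortest entry corresponds to the return x version and the longest to the guess = x version.

```python
def repeat_lengths(seq):
    """Return every length x >= 2 where seq[:x] is immediately followed by a copy of itself."""
    half = len(seq) // 2
    return [x for x in range(2, half + 1) if seq[:x] == seq[x:2 * x]]

list_b = [67, 4, 67, 4, 67, 4, 67, 4, 2, 9, 0]
candidates = repeat_lengths(list_b)

print(candidates)                  # [2, 4]
print(min(candidates, default=1))  # 2, the shortest match
print(max(candidates, default=1))  # 4, the longest match
```

The default=1 argument mirrors the fallback of returning 1 when no pattern is found.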

I took Maria's faster and more Stack Overflow-compliant answer and made it find the largest sequence first:

def guess_seq_len(seq, verbose=False):
    seq_len = 1
    initial_item = seq[0]
    butfirst_items = seq[1:]
    if initial_item in butfirst_items:
        # +1 converts the index within butfirst_items into an index within seq
        first_match_idx = butfirst_items.index(initial_item) + 1
        if verbose:
            print(f'"{initial_item}" was found at index 0 and index {first_match_idx}')
        max_seq_len = min(len(seq) - first_match_idx, first_match_idx)
        # Try the longest candidate first and shrink until the two slices agree
        for seq_len in range(max_seq_len, 0, -1):
            if seq[:seq_len] == seq[first_match_idx:first_match_idx+seq_len]:
                if verbose:
                    print(f'A sequence length of {seq_len} was found at index {first_match_idx}')
                break

    return seq_len

This worked for me.

def repeated(L):
    '''Reduce the input list to a list of all repeated integers in the list.'''
    return [item for item in list(set(L)) if L.count(item) > 1]

def print_result(L, name):
    '''Print the output for one list.'''
    output = repeated(L)
    print('%s count = %i (for %s)' % (name, len(output), output))

list_a = [111, 0, 3, 1, 111, 0, 3, 1, 111, 0, 3, 1]
list_b = [67, 4, 67, 4, 67, 4, 67, 4, 2, 9, 0]
list_c = [
    1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2,
    3, 4, 5, 6, 7, 8, 9, 0, 23, 18, 10
]

print_result(list_a, 'list_a')
print_result(list_b, 'list_b')
print_result(list_c, 'list_c')

Python's set() function will transform a list to a set, a datatype that can only contain one of any given value, much like a set in algebra. I converted the input list to a set, and then back to a list, reducing the list to only its unique values. I then tested the original list for each of these values to see if it contained that value more than once. I returned a list of all of the duplicates. The rest of the code is just for demonstration purposes, to show that it works.
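The set() and count() combination rescans the whole list once per unique value. As a minimal sketch of the same idea, collections.Counter can tally every value in a single pass. The name repeated_ordered is a hypothetical helper, and the ordering of the result assumes Python 3.7+, where Counter keeps keys in order of first appearance.

```python
from collections import Counter

def repeated_ordered(seq):
    """Items that occur more than once, in order of first appearance."""
    counts = Counter(seq)  # one pass over the list
    return [item for item, n in counts.items() if n > 1]

list_a = [111, 0, 3, 1, 111, 0, 3, 1, 111, 0, 3, 1]
list_b = [67, 4, 67, 4, 67, 4, 67, 4, 2, 9, 0]

print(repeated_ordered(list_a))  # [111, 0, 3, 1]
print(repeated_ordered(list_b))  # [67, 4]
```

Like the function above, this reports which values occur more than once; it does not check that they form a contiguous repeating block.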

Edit: Syntax highlighting didn't like the apostrophe in my docstring.
