How to find the longest repeating sequence using Python

In an interview I was asked to print the longest repeated character sequence in a string.

I got stuck. Is there any way to do it?

My code only prints the count of each character in the string. Is there an approach that gives the expected output?

import pandas as pd
import collections

a   = 'abcxyzaaaabbbbbbb'
lst = collections.Counter(a)
df  = pd.Series(lst)
df

Expected output:

bbbbbbb

How can I add this logic to the code above?

A regex solution:

import re
max(re.split(r'((.)\2*)', a), key=len)

Or without library help (but less efficient, and it needs Python 3.8+ for the := operator):

s = ''
max((s := s * (c in s) + c for c in a), key=len)

Both compute the string 'bbbbbbb'.
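
For comparison, here is a minimal sketch of the same idea using itertools.groupby from the standard library. It is not part of the answer above, and the function name longest_run is made up for illustration:

```python
from itertools import groupby

def longest_run(text):
    """Return the longest run of one repeated character ('' for empty input)."""
    # groupby yields one group per run of equal adjacent characters.
    runs = (''.join(group) for _, group in groupby(text))
    # max returns the first longest run when there is a tie.
    return max(runs, key=len, default='')

print(longest_run('abcxyzaaaabbbbbbb'))  # bbbbbbb
```

This makes a single pass over the string and avoids both the regex and the repeated `c in s` checks.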

Without any modules, you could use a comprehension to go backward through the possible sizes and take the first character repetition that is present in the string:

next(c*s for s in range(len(a),0,-1) for c in a if c*s in a)

That is quite inefficient, though.

Another approach would be to detect the positions where the letter changes and take the longest subrange between them:

chg = [i for i,(x,y) in enumerate(zip(a,a[1:]),1) if x!=y]
s,e = max(zip([0]+chg,chg+[len(a)]),key=lambda se:se[1]-se[0])
longest = a[s:e]
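
To make the intermediate values concrete, here is the same snippet traced on the sample string from the question. The name runs is introduced only for this illustration:

```python
a = 'abcxyzaaaabbbbbbb'

# Indices where a character differs from the one before it.
chg = [i for i, (x, y) in enumerate(zip(a, a[1:]), 1) if x != y]
print(chg)  # [1, 2, 3, 4, 5, 6, 10]

# Pair each run start with the next run start (or the end of the string).
runs = list(zip([0] + chg, chg + [len(a)]))
print(runs[-2:])  # [(6, 10), (10, 17)]

# The longest (start, end) pair is the longest streak.
s, e = max(runs, key=lambda se: se[1] - se[0])
print(a[s:e])  # bbbbbbb
```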

Of course, a basic for-loop solution will also work:

si,sc = 0,"" # current streak (start, character)
ls,le = 0,0  # longest streak (start, end)
for i,c in enumerate(a+" "):      # extra space to force out last char.
    if i-si > le-ls: ls,le = si,i # new longest
    if sc != c:      si,sc = i,c  # new streak
longest = a[ls:le]

print(longest) # bbbbbbb
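
If a reusable version is wanted, the loop could be wrapped in a function along these lines. This is only a sketch: the name longest_streak is hypothetical, and it checks for the end of the string explicitly instead of appending a sentinel space:

```python
def longest_streak(text):
    """Single pass over text; returns '' for an empty string."""
    best_start, best_end = 0, 0   # longest streak so far, as a slice [start, end)
    start = 0                     # start index of the current streak
    for i in range(1, len(text) + 1):
        # A streak ends at the end of the string or where the character changes.
        if i == len(text) or text[i] != text[start]:
            if i - start > best_end - best_start:
                best_start, best_end = start, i
            start = i
    return text[best_start:best_end]

print(longest_streak('abcxyzaaaabbbbbbb'))  # bbbbbbb
```

When two streaks have the same length, this version keeps the first one.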

A more long-winded solution, taken wholesale from:
maximum-consecutive-repeating-character-string

def maxRepeating(s):  # parameter renamed from str to avoid shadowing the built-in

    len_s = len(s)
    count = 0

    # Find the maximum repeating
    # character starting from s[i]
    res = s[0]
    for i in range(len_s):

        cur_count = 1
        for j in range(i + 1, len_s):
            if s[i] != s[j]:
                break
            cur_count += 1

        # Update result if required
        if cur_count > count:
            count = cur_count
            res = s[i]
    return res, count

# Driver code
if __name__ == "__main__":
    a = "abcxyzaaaabbbbbbb"
    print(maxRepeating(a))

Output:

('b', 7)
