
Consecutive repetitions in Python

I'm new to programming, so don't be cruel to me :) I'm struggling with the problem of finding the highest number of consecutive repetitions of a substring in a string. I'm given a substring, for example "ABC", and a file with sequences of letters, e.g. "ABC ABC BBC CDA ABC ABC ABC DBA" (the spaces are not part of the data; they are only there for readability). Here the output should be 3, the highest number of repetitions one after another.

I'm thinking of using the str.count(sub[, start[, end]]) method, but I have no idea how to use it to get valid output. I tried creating a substring s = string[i][j] and then using s2, which is string[i+len(substring):j+len(substring)], but there seemed to be too many cases, so I gave up on it. With the code below I got valid output, but only in a few cases. I hope you can help me with it. Thanks!

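substr_count = 0
string = "ABCABCBBCCDAABCABCDBA"
substring = "ABC"
start = 0  # set once, outside the loop, so the search keeps moving forward

while True:
    loc = string.find(substring, start)
    if loc == -1:
        break
    substr_count += 1
    start = loc + len(substring)

# Note: this counts every occurrence, not the longest consecutive run.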

As usr2564301 said, itertools.groupby would be the way to go. Here's a silly, kind of brute-force-ish way to go about it:

def max_repititions(string, substring):
    if not substring:
        return 0
    for count in range(len(string), 0, -1):
        if substring*count in string:
            return count
    return 0

string = "ABCABCBBCCDAABCABCDBA"
substring = "ABC"

print(max_repititions(string, substring))
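For comparison, here is a rough sketch of what an itertools.groupby version might look like. The function name max_consecutive and the list name hits are made up for this illustration. The idea is that back-to-back copies always start len(substring) characters apart, so for each possible offset you can look at every n-th position and measure the runs of matches.

```python
from itertools import groupby

def max_consecutive(string, substring):
    """Return the longest run of back-to-back copies of substring in string."""
    n = len(substring)
    if n == 0:
        return 0
    # hits[i] is True when a copy of substring starts at index i.
    hits = [string.startswith(substring, i) for i in range(len(string))]
    best = 0
    # Back-to-back copies start n characters apart, so for each offset
    # look at every n-th position and measure the runs of True.
    for offset in range(n):
        for is_hit, group in groupby(hits[offset::n]):
            if is_hit:
                best = max(best, sum(1 for _ in group))
    return best

print(max_consecutive("ABCABCBBCCDAABCABCABCDBA", "ABC"))  # 3
```

Unlike the brute-force check, this does not build ever-longer test strings, and it should also cope with substrings that overlap themselves (for example "ABA" in "ABABAABA").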

You can do this in a few lines of code using regular expressions.

import re

string = "ABCABCBBCCDAABCABCDBA"

# The lookahead (?=...) is tried at every position in the string; the group
# inside it captures the run of back-to-back "ABC" copies starting there.
string_regex = re.compile(r'(?=((?:ABC)+))')
in_a_row = max((m.group(1) for m in string_regex.finditer(string)), key=len, default='')
substr_count = len(in_a_row) // len('ABC')
print(substr_count)

Import re like you would any other package, define the string, and put whatever you want to find where ABC is now.

This works by searching the given string, here named string, for one or more back-to-back repetitions (that's what the plus sign is for) of the text in the parentheses, and keeping the longest run found. Checking every position matters: a single search would stop at the first run it finds, which is not necessarily the longest. Then simply take the length of in_a_row and divide it by the length of the text you asked it to find, and you are left with the number of times it repeats.
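The pattern above hard-codes ABC. If the substring comes from a variable, something along these lines might work. It is only a sketch: the helper name longest_run_regex is made up, and it assumes the substring should be matched literally, which is why it goes through re.escape before being placed in the pattern.

```python
import re

def longest_run_regex(string, substring):
    """Longest run of back-to-back copies of substring (hypothetical helper)."""
    if not substring:
        return 0
    # re.escape keeps characters such as "." or "*" in substring literal.
    pattern = re.compile('(?=((?:' + re.escape(substring) + ')+))')
    # The lookahead is tried at every position; group 1 is the run starting there.
    longest = max((m.group(1) for m in pattern.finditer(string)), key=len, default='')
    return len(longest) // len(substring)

print(longest_run_regex("ABCABCBBCCDAABCABCABCDBA", "ABC"))  # 3
print(longest_run_regex("a.a.b", "a."))  # 2
```

Without re.escape, a substring like "a." would be read as "a followed by any character" and could over-count.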
