Write a function that determines the maximum number of consecutive BA, CA character pairs per line

Question

My respects, colleagues. I need to write a function that determines the maximum number of consecutive BA, CA character pairs per line.

print(f("BABABA125"))  # -> 3
print(f("234CA4BACA"))  # -> 2
print(f("BABACABACA56"))  # -> 5
print(f("1BABA24CA"))  # -> 2

Actually, I've written a function, but, to my mind, it's not very good.

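def f(s: str) -> int:
    res = 0

    if not s:
        return res

    cur = 0
    i = len(s) - 1

    # Scan right to left, consuming two characters per matched pair.
    while i >= 0:
        # i > 0 keeps s[i-1] from wrapping around to s[-1] when i == 0.
        if i > 0 and s[i] == "A" and (s[i-1] == "B" or s[i-1] == "C"):
            cur += 1
            i -= 2
        else:
            if cur > res:
                res = cur
            cur = 0  # reset the run even when it was not a new maximum
            i -= 1
    else:
        # Account for a run that ends at the start of the string.
        if cur > res:
            res = cur

    return res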

In addition, I'm not allowed to use libraries or regular expressions (only string and list methods). Could you please help me or review my code in this context? I'll be very grateful.

Answer 1

Here's a function f2 that performs this operation.

if not re.search('(BA|CA)', s): return 0
First check whether the string actually contains any BA or CA (to prevent ValueError: max() arg is an empty sequence in step 3), and return 0 if it doesn't.
matches = re.finditer(r'(?:CA|BA)+', s)
Find all runs of consecutive CA or BA. The group is non-capturing, (?:...), since it only serves to repeat the alternation; re.finditer yields a re.Match object for each full run either way.
res = max(matches, key=lambda m: len(m.group(0)))
Then, among the matches (re.Match objects), fetch the matched substring using m.group(0) and compare their lengths to find the longest one.
return len(res.group(0))//2
Divide the length of the longest result by 2 to get the number of BA or CA pairs in this substring. Here we use floor division // so that the result is an int, since true division / always returns a float.

import re

strings = [
    "BABABA125",  # 3
    "234CA4BACA",  # 2
    "BABACABACA56",  # 5
    "1BABA24CA",  # 2
    "NO_MATCH_TO_BE_FOUND",  # 0
]

def f2(s: str):
    if not re.search('(BA|CA)', s): return 0
    matches = re.finditer(r'(?:CA|BA)+', s)
    res = max(matches, key=lambda m: len(m.group(0)))
    return len(res.group(0))//2

for s in strings:
    print(f2(s))
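
As a side note on the ValueError guard in the first step: max itself accepts a default keyword (Python 3.4 and later) that is returned when the iterable is empty, so the separate re.search check is not strictly required. A minimal sketch of that variant follows; the name f2_default is hypothetical and not part of the original answer.

```python
import re


def f2_default(s: str) -> int:
    # Hypothetical variant of f2 without the separate re.search pre-check.
    # Lengths of every run of consecutive "BA"/"CA" pairs in s.
    run_lengths = (len(m.group(0)) for m in re.finditer(r"(?:CA|BA)+", s))
    # default=0 is returned when there are no matches, so max() never
    # raises on an empty sequence.
    return max(run_lengths, default=0) // 2


for s in ["BABABA125", "234CA4BACA", "BABACABACA56", "1BABA24CA", "NO_MATCH_TO_BE_FOUND"]:
    print(f2_default(s))
```

This should print the same five values as f2 for the sample strings.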

UPDATE: Thanks to @StevenRumbalski for providing a simpler version of the above answer. (I split it into multiple lines for readability.)

def f3(s):
    if not re.search('(BA|CA)', s): return 0
    matches = re.findall(r'(?:CA|BA)+', s)
    max_length = max(map(len, matches))
    return max_length // 2

if not re.search('(BA|CA)', s): return 0
Same as above.
matches = re.findall(r'(?:CA|BA)+', s)
Find all runs of consecutive CA or BA, but each value in matches is a str instead of a re.Match, which is easier to handle. The non-capturing group matters here: if the pattern contained a capturing group, re.findall would return the group's contents instead of the full match.
max_length = max(map(len, matches))
Map each matched substring to its length and find the maximum length among them.
return max_length // 2
Floor divide the length of the longest matching substring by 2 (the length of BA or CA) to get the number of consecutive occurrences of BA or CA in this string.
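
To illustrate why the pattern uses (?:...) with re.findall, here is a small sketch comparing a capturing and a non-capturing group on the same input. The sample string is made up for illustration.

```python
import re

s = "BABACA1CA"  # illustrative input, not from the original post

# Capturing group: findall returns the group's content for each match
# (the last repetition of the group), not the whole matched run.
captured = re.findall(r"(CA|BA)+", s)

# Non-capturing group: findall returns the whole matched run.
full = re.findall(r"(?:CA|BA)+", s)

print(captured)  # ['CA', 'CA']
print(full)      # ['BABACA', 'CA']
```

With the capturing version every entry has length 2, so the length-based count would always come out as 1.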

Answer 2

Here's an alternative implementation without any imports. Do note, however, that it's quite slow compared to your C-style implementation.

The idea is simple: transform the input string into a string consisting of only two types of characters, c1 and c2, with c1 representing CA or BA, and c2 representing anything else. Then find the longest substring of consecutive c1 characters.

The implementation is as follows:

Pick a character that is guaranteed not to appear in the input string; here we use + as an example. Then pick a second character different from the first; here we use -.
Replace each occurrence of CA and BA with a +.
Replace every other character in the string (anything that is not a +) with a - (this is why + cannot be present in the original input string). Now we have a string consisting purely of + and - characters.
Split the string with - as the delimiter, and map each resulting substring to its length.
Return the maximum of these substring lengths.
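
To make the steps above concrete, here is a sketch that prints the intermediate value after each step for one of the sample inputs. The helper name trace_steps is made up for illustration.

```python
def trace_steps(string: str) -> int:
    # Hypothetical helper: follows the steps described above and prints
    # the intermediate value after each one.
    replaced = string.replace("CA", "+").replace("BA", "+")
    print(replaced)  # each CA/BA pair collapsed to "+"
    masked = "".join("+" if c == "+" else "-" for c in replaced)
    print(masked)    # every other character collapsed to "-"
    runs = masked.split("-")
    print(runs)      # runs of "+" between the "-" delimiters
    return max(map(len, runs))


print(trace_steps("234CA4BACA"))
# 234+4++
# ---+-++
# ['', '', '', '+', '++']
# 2
```

The empty strings in the list come from adjacent - delimiters; they have length 0 and do not affect the maximum.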

strings = [
    "BABABA125",  # 3
    "234CA4BACA",  # 2
    "BABACABACA56",  # 5
    "1BABA24CA",  # 2
    "NO_MATCH_TO_BE_FOUND",  # 0
]

def f4(string: str):
    string = string.replace("CA", "+")
    string = string.replace("BA", "+")
    string = "".join([(c if c == "+" else "-") for c in string])
    str_list = string.split("-")
    str_lengths = map(len, str_list)
    return max(str_lengths)

for s in strings:
    print(f4(s))
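
The remark about speed at the top of this answer is easy to check on your own machine. Below is a rough timing sketch using timeit; the names loop_version and replace_version, the sample input, and the repeat count are all arbitrary choices for illustration, and the numbers will vary with input and Python version.

```python
import timeit


def loop_version(s: str) -> int:
    # Index-based right-to-left scan, in the style of the question's code.
    res = cur = 0
    i = len(s) - 1
    while i >= 0:
        if i > 0 and s[i] == "A" and s[i - 1] in "BC":
            cur += 1
            i -= 2
        else:
            res = max(res, cur)
            cur = 0
            i -= 1
    return max(res, cur)


def replace_version(s: str) -> int:
    # Same steps as f4: collapse pairs to "+", everything else to "-".
    s = s.replace("CA", "+").replace("BA", "+")
    s = "".join("+" if c == "+" else "-" for c in s)
    return max(map(len, s.split("-")))


sample = "BABACABACA56" * 50  # arbitrary test input

for fn in (loop_version, replace_version):
    seconds = timeit.timeit(lambda: fn(sample), number=1000)
    print(f"{fn.__name__}: {seconds:.4f} s for 1000 calls")
```

Both functions should return the same count for the sample; only the timings differ.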

2 answers

Answer 1: 1, 2023-02-01 15:32:06
Answer 2: 0, accepted, 2023-02-01 20:26:49
