Write a function that determines the maximum number of consecutive BA, CA character pairs per line

Question

My respects, colleagues. I need to write a function that determines the maximum number of consecutive BA, CA character pairs per line.

print(f("BABABA125"))  # -> 3
print(f("234CA4BACA"))  # -> 2
print(f("BABACABACA56"))  # -> 5
print(f("1BABA24CA"))  # -> 2

Actually, I've written a function, but, to my mind, it's not very good.

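def f(s: str) -> int:
    res = 0
    cur = 0
    i = len(s) - 1

    # Scan right to left; stop at i == 1 so that s[i - 1] never wraps
    # around to the last character of the string.
    while i >= 1:
        if s[i] == "A" and s[i - 1] in ("B", "C"):
            cur += 1
            i -= 2
        else:
            # The run is broken: record it and always reset the counter.
            res = max(res, cur)
            cur = 0
            i -= 1

    # Account for a run that reaches the start of the string.
    return max(res, cur)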

In addition, I'm not allowed to use libraries or regular expressions (only string and list methods). Could you please help me or rate my code in this context? I'll be very grateful.

Answer 1

Here's a function f2 that performs this operation.

1. if not re.search('(BA|CA)', s): return 0
   First check whether the string contains any BA or CA at all, and return 0 if it does not. This prevents "ValueError: max() arg is an empty sequence" in step 3.
2. matches = re.finditer(r'(?:CA|BA)+', s)
   Find every run of consecutive CA or BA pairs. The group is non-capturing because only the full match is needed; each item yielded by re.finditer is a re.Match object whose group(0) is the whole run.
3. res = max(matches, key=lambda m: len(m.group(0)))
   Among the matches (re.Match objects), fetch each matched substring with m.group(0) and compare their lengths to find the longest one.
4. return len(res.group(0))//2
   Divide the length of the longest run by 2 to get the number of BA or CA pairs in it. Floor division // is used so that the result is an int, since / always returns a float.
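
To see what the non-capturing group changes in practice, here is a small sketch. The sample string is a made-up input, not one from the question. With a capturing group, re.findall returns only what the group captured last, while m.group(0) from re.finditer is the full run either way.

```python
import re

sample = "BABACA1CA"  # hypothetical input chosen for illustration

# Capturing group: findall returns the last repetition captured by the group.
captured = re.findall(r'(CA|BA)+', sample)

# Non-capturing group: findall returns each full run.
full_runs = re.findall(r'(?:CA|BA)+', sample)

# finditer yields re.Match objects; group(0) is always the whole match.
via_finditer = [m.group(0) for m in re.finditer(r'(CA|BA)+', sample)]

print(captured)      # ['CA', 'CA']
print(full_runs)     # ['BABACA', 'CA']
print(via_finditer)  # ['BABACA', 'CA']
```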

import re

strings = [
    "BABABA125",  # 3
    "234CA4BACA",  # 2
    "BABACABACA56",  # 5
    "1BABA24CA",  # 2
    "NO_MATCH_TO_BE_FOUND",  # 0
]

def f2(s: str):
    if not re.search('(BA|CA)', s): return 0
    matches = re.finditer(r'(?:CA|BA)+', s)
    res = max(matches, key=lambda m: len(m.group(0)))
    return len(res.group(0))//2

for s in strings:
    print(f2(s))

UPDATE: Thanks to @StevenRumbalski for providing a simpler version of the above answer. (I split it into multiple lines for readability)

def f3(s):
    if not re.search('(BA|CA)', s): return 0
    matches = re.findall(r'(?:CA|BA)+', s)
    max_length = max(map(len, matches))
    return max_length // 2

1. if not re.search('(BA|CA)', s): return 0
   Same as above.
2. matches = re.findall(r'(?:CA|BA)+', s)
   Find every run of consecutive CA or BA pairs. Because the group is non-capturing, re.findall returns the full matched substrings, so each value in matches is a str instead of a re.Match, which is easier to handle.
3. max_length = max(map(len, matches))
   Map each matched substring to its length and find the maximum length among them.
4. return max_length // 2
   Floor divide the length of the longest run by 2, the length of a BA or CA pair, to get the number of consecutive occurrences of BA or CA in the string.
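
As a possible simplification, assuming Python 3.4 or later: max accepts a default keyword argument that is returned when the iterable is empty, so the preliminary re.search check could be dropped. The name f3_default is made up for this sketch and does not come from the original answer.

```python
import re


def f3_default(s: str) -> int:  # hypothetical name, not from the answer
    # Each element is a full run such as "BABACA" (the group is non-capturing).
    runs = re.findall(r'(?:CA|BA)+', s)
    # default=0 covers the case where no run was found at all.
    return max(map(len, runs), default=0) // 2


for s in ["BABABA125", "234CA4BACA", "BABACABACA56", "1BABA24CA", "NO_MATCH_TO_BE_FOUND"]:
    print(f3_default(s))
```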

Answer 2

Here's an alternative implementation without any imports. Note, however, that it is quite slow compared to your C-style implementation.

The idea is simple: transform the input string into a string consisting of only two types of characters, c1 and c2, with c1 representing CA or BA and c2 representing anything else. Then find the longest substring of consecutive c1s.

The implementation is as follows:

1. Pick a character that is guaranteed not to appear in the input string; here we use + as an example. Then pick a second character different from the first; here we use -.
2. Replace each occurrence of CA and BA with a +.
3. Replace every other character in the string (anything that is not a +) with a - (this is why + cannot be present in the original input string). The string now consists purely of +s and -s.
4. Split the string with - as the delimiter, and map each resulting substring to its length.
5. Return the maximum of these substring lengths.
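
The steps above can be traced on one of the question's inputs. This is only an illustrative walk-through; the intermediate variable names are invented for the trace.

```python
s = "234CA4BACA"

# Replace every CA and BA pair with the marker "+".
marked = s.replace("CA", "+").replace("BA", "+")

# Turn every remaining character into "-".
binary = "".join("+" if c == "+" else "-" for c in marked)

# Split on "-" and measure each piece.
pieces = binary.split("-")
lengths = [len(p) for p in pieces]

print(marked)        # 234+4++
print(binary)        # ---+-++
print(pieces)        # ['', '', '', '+', '++']
print(max(lengths))  # 2
```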

strings = [
    "BABABA125",  # 3
    "234CA4BACA",  # 2
    "BABACABACA56",  # 5
    "1BABA24CA",  # 2
    "NO_MATCH_TO_BE_FOUND",  # 0
]

def f4(string: str):
    string = string.replace("CA", "+")
    string = string.replace("BA", "+")
    string = "".join([(c if c == "+" else "-") for c in string])
    str_list = string.split("-")
    str_lengths = map(len, str_list)
    return max(str_lengths)

for s in strings:
    print(f4(s))
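
If the input might itself contain + or -, one way around the restriction is to build the marker string in a single pass instead of using replace. The following is a rough sketch under that assumption, still using only string and list methods; f5 is a made-up name and not part of the original answer.

```python
def f5(s: str) -> int:  # hypothetical variant, not from the answer
    marks = []
    i = 0
    while i < len(s):
        if s[i:i + 2] in ("BA", "CA"):
            # A pair: emit one "+" and skip both characters.
            marks.append("+")
            i += 2
        else:
            # Anything else, including a literal "+" or "-", breaks the run.
            marks.append("-")
            i += 1
    # split always returns at least one piece, so max never sees an empty list.
    return max(map(len, "".join(marks).split("-")))


print(f5("BABACABACA56"))  # 5
print(f5("+BA+"))          # 1
```

Here the markers are produced by the function itself, so a + in the input is treated like any other non-pair character.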

2 answers

solution1
1 2023-02-01 15:32:06

solution2
0 ACCEPTED 2023-02-01 20:26:49
