How to find positions of the last occurrence of a pattern in a string, and use these to extract a substring from another string

Question

I need help with a specific problem for which I cannot find an answer on this website. I have a result that looks something like this:

result = "ooooooooooooooooooooooMMMMMMooooooooooooooooooMMMMMMooooooooooMMMMMMMMoo"

This is a transmembrane prediction. For this string I have a second string of the same length that contains the amino acid code, for example:

amino_acid_code = "MSDENKSTPIVKASDITDKLKEDILTISKDALDKNTWHVIVGKNFGSYVTHEKGHFVYFYIGPLAFLVFKTA"

I want to do some research on the last "M" region. Its length can vary, as can the length of the "o" run that follows it. So in this case I need to extract "PLAFLVFK" from the second string, which corresponds to the last "M" region.

I have something like this already, but I cannot figure out how to obtain the start position, and I also believe a simpler (or computationally better) solution is possible.

end = result.rfind('M')
start = ?
region_I_need = amino_acid_code[start:end]

Thanks in advance

Answer 1

To find the start position as well, call rfind again on the result string truncated at end. The last 'o' before that point sits immediately before the final "M" region, so the region starts one position after it:

result = "ooooooooooooooooooooooMMMMMMooooooooooooooooooMMMMMMooooooooooMMMMMMMMoo"
amino_acid_code = "MSDENKSTPIVKASDITDKLKEDILTISKDALDKNTWHVIVGKNFGSYVTHEKGHFVYFYIGPLAFLVFKTA"

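# rfind returns the index of the last 'M'; add 1 to get an exclusive slice end
end = result.rfind('M') + 1
# the last 'o' before that point sits just before the region; add 1 to get its start
# (if there is no 'o', rfind returns -1 and start becomes 0)
start = result[:end].rfind('o') + 1
region_I_need = amino_acid_code[start:end]

print(start, end)     # 62 70
print(region_I_need)  # PLAFLVFK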
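
As an alternative, the same region can be located with a regular expression that matches every run of "M" and keeps the last one. This is only a sketch: the helper name last_region and its label parameter are made up for illustration and are not part of the original answer.

```python
import re

def last_region(prediction, sequence, label="M"):
    """Return (start, end, residues) for the last run of `label`, or None if there is none."""
    # Collect every maximal run of the label character, e.g. "MMMMMM".
    runs = list(re.finditer(re.escape(label) + "+", prediction))
    if not runs:
        return None
    last = runs[-1]
    # match.end() is already exclusive, so it can be used directly as a slice end.
    return last.start(), last.end(), sequence[last.start():last.end()]

result = "ooooooooooooooooooooooMMMMMMooooooooooooooooooMMMMMMooooooooooMMMMMMMMoo"
amino_acid_code = "MSDENKSTPIVKASDITDKLKEDILTISKDALDKNTWHVIVGKNFGSYVTHEKGHFVYFYIGPLAFLVFKTA"

print(last_region(result, amino_acid_code))  # (62, 70, 'PLAFLVFK')
```

This version scans the whole string, so the two rfind calls are likely cheaper for very long predictions. The regex form may be easier to adapt if other regions (for example the first or second "M" run) are needed later.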

solution1
1 ACCEPTED 2018-03-23 10:54:15