I have an RNA string, e.g.:
AUGGCCAUA
I would like to generate all substrings in the following way:
# starting from character 0
AUG, GCC, AUA
# starting from character 1
UGG, CCA
# starting from character 2
GGC, CAU
I wrote code that solves the first sub-problem:
for i in range(0,len(rna)):
    if fmod(i,3)==0:
        print rna[i:i+3]
I tried changing the starting position, i.e.:
for i in range(1,len(rna)):
But it produces incorrect results:
GCC, UA #instead of UGG, CCA
Could you please give me a hint as to where my mistake is?
The problem with your code is that you always extract substrings at indices that are divisible by 3, regardless of the starting position. Instead, try this:
a = 'AUGGCCAUA'
def getSubStrings(RNA, position):
    return [RNA[i:i+3] for i in range(position, len(RNA) - 2, 3)]
print getSubStrings(a, 0)
print getSubStrings(a, 1)
print getSubStrings(a, 2)
Output
['AUG', 'GCC', 'AUA']
['UGG', 'CCA']
['GGC', 'CAU']
Explanation
range(position, len(RNA) - 2, 3)
will generate a list of numbers with a common difference of 3, starting from position
and stopping before the length of the string minus 2. For example,
print range(1, 8, 3)
Here 1 is the starting number, 8 is the stop value (which is excluded), and 3 is the common difference, so it gives
[1, 4, 7]
These are our starting indices. We then use a list comprehension to build the new list, like this:
[RNA[i:i+3] for i in range(position, len(RNA) - 2, 3)]
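To tie this back to the loop in the question, here is a minimal sketch of why the modulo test fails and how it relates to the stepped range. The helper names codons_by_modulo and codons_by_step are hypothetical and only used for illustration. The first keeps the question's loop but tests the offset from the starting position rather than the absolute index; the second is the stepped range from this answer.

```python
rna = 'AUGGCCAUA'

def codons_by_modulo(rna, start):
    # Hypothetical helper: same loop shape as in the question, but the
    # divisibility test uses the offset from `start`, not the index itself.
    result = []
    for i in range(start, len(rna) - 2):
        if (i - start) % 3 == 0:
            result.append(rna[i:i + 3])
    return result

def codons_by_step(rna, start):
    # Hypothetical helper: lets range() do the stepping, as in the answer.
    return [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]

for start in range(3):
    # Both approaches should agree for every starting position.
    print('%d %s' % (start, codons_by_modulo(rna, start)))
```

Under these assumptions both helpers return the same lists; the stepped range is simply shorter because it never visits the indices that the modulo test would reject.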
Is this what you're looking for?
for i in range(len(rna)):
    if len(rna[i:i+3]) == 3:  # skip the incomplete slices at the end
        print(rna[i:i+3])
outputs:
AUG
UGG
GGC
GCC
CCA
CAU
AUA
I thought of this one-liner:
a = 'AUGGCCAUA'
[a[x:x+3] for x in range(len(a))][:-2]
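The one-liner gives every overlapping triplet rather than the three separate groups asked for. As a rough sketch, the groups can be recovered by taking every third window; the names windows and frames are assumptions made for this example.

```python
a = 'AUGGCCAUA'

# All overlapping triplets; stopping at len(a) - 2 avoids the short
# slices at the end, so no trimming with [:-2] is needed.
windows = [a[x:x + 3] for x in range(len(a) - 2)]

# Every third window, starting at offsets 0, 1 and 2, gives one group each.
frames = [windows[offset::3] for offset in range(3)]

for frame in frames:
    print(frame)
```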
def generate(str, index):
    for i in range(index, len(str), 3):
        if len(str[i:i+3]) == 3:
            print str[i:i+3]
Example:
In [29]: generate(str, 1)
UGG
CCA
In [30]: generate(str, 0)
AUG
GCC
AUA
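If the triplets are needed as values rather than printed output, one possible variant is a generator. This is only a sketch: the name codons and the parameter name rna are hypothetical, chosen here so that the built-in str is not shadowed.

```python
def codons(rna, start):
    # Hypothetical variant: yields each complete triplet instead of printing.
    # Stopping at len(rna) - 2 means no length check is needed.
    for i in range(start, len(rna) - 2, 3):
        yield rna[i:i + 3]

rna = 'AUGGCCAUA'
frame1 = list(codons(rna, 1))
print(frame1)
```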