Never-ending loop? Can't get Python to stop running
When I try to run this code it never finishes. I think it is getting stuck somewhere, but I am not sure where, because I am new to Python.
import re
codon = []
rcodon = []
dataset = "ggtcagaaaaagccctctccatgtctactcacgatacatccctgaaaaccactgaggaagtggcttttcagatcatcttgctttgccagtttggggttgggacttttgccaatgtatttctctttgtctataatttctctccaatctcgactggttctaaacagaggcccagacaagtgattttaagacacatggctgtggccaatgccttaactctcttcctcactatatttccaaacaacatgatga"
startcodon=0
n=0
print ("DNA sequence: ", dataset)
def find_codon(codon, string, start):
    i = start + 3
    while i < len(string):
        i = string.find(codon, i) # find the next substring
        if (i - start) % 3 == 0: # check that it's a multiple of 3 after start
            return i
    return None
while(n < 1):
    startcodon=dataset.find("atg", startcodon)
    #locate stop codons
    taacodon=find_codon("taa", dataset, startcodon)
    tagcodon=find_codon("tag", dataset, startcodon)
    tgacodon=find_codon("tga", dataset, startcodon)
    stopcodon = min(taacodon, tagcodon, tgacodon)
    codon.append(dataset[startcodon:stopcodon+3])
    if(startcodon > len(dataset) or startcodon < 0):
        n = 2;
    startcodon=stopcodon
#reverse the string and swap the letters
n=0;
while(n < len(codon)):
    rcodon.append (codon[n][len(codon[n])::-1])
    #replace a with u
    rcodon[n] = re.sub('a', "u", rcodon[n])
    #replace t with a
    rcodon[n] = re.sub('t', "a", rcodon[n])
    #replace c with x
    rcodon[n] = re.sub('c', "x", rcodon[n])
    #replace g with c
    rcodon[n] = re.sub('g', "c", rcodon[n])
    #replace x with g
    rcodon[n] = re.sub('x', "g", rcodon[n])
    print("DNA sequence: ", codon[n] ,'\n', "RNA sequence:", rcodon[n])
    n=n+1
answer = 0
print("Total Sequences: ", len(codon)-3)
while (int(answer) >=0):
    #str = "Please enter an integer from 0 to " + str(len(dataset)) + " or -1 to quit: "
    answer = int(input("Please enter a sequence you would like to see or -1 to quit: "))
    if(int(answer) >= 0):
        print("DNA sequence: ", codon[int(answer)] ,'\n', "RNA sequence:", rcodon[int(answer)])
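Any comments would be helpful.
This is a project on transcribing DNA without biopython. Goal: create a program that locates "atg" in a DNA sequence and then finds the terminating sequence (tga, taa, or tag) when counting in threes from the initial atg.
Edit: I want the program to give me the sequence between the atg and the stop codon, just as my original code did. However, my original code did not account for moving in steps of 3 from the atg to find the correct stop sequence.
My original code: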
import re
codon = []
rcodon = []
dataset = "ggtcagaaaaagccctctccatgtctactcacgatacatccctgaaaaccactgaggaagtggcttttcagatcatcttgctttgccagtttggggttgggacttttgccaatgtatttctctttgtctataatttctctccaatctcgactggttctaaacagaggcccagacaagtgattttaagacacatggctgtggccaatgccttaactctcttcctcactatatttccaaacaacatgatga"
startcodon=0
n=0
while(n < 1):
    startcodon=dataset.find("atg", startcodon, len(dataset)-startcodon)
    #locate stop codons
    taacodon=dataset.find("taa", startcodon+3, len(dataset)-startcodon)
    tagcodon=dataset.find("tag", startcodon+3, len(dataset)-startcodon)
    tgacodon=dataset.find("tga", startcodon+3, len(dataset)-startcodon)
    if(taacodon<tagcodon):
        if(taacodon<tgacodon):
            stopcodon=taacodon
            #print("taacodon", startcodon)
        else:
            stopcodon=tgacodon
            #print("tGacodon", startcodon)
    elif(tgacodon>tagcodon):
        stopcodon=tagcodon
        #print("taGcodon", startcodon)
    else:
        stopcodon=tgacodon
        #print("tGacodon", startcodon)
    #to add sequences to an array
    codon.append(dataset[startcodon:stopcodon+3])
    if(startcodon > len(dataset) or startcodon < 0):
        n = 2;
    startcodon=stopcodon
#reverse the string and swap the letters
n=0;
while(n < len(codon)):
    rcodon.append (codon[n][len(codon[n])::-1])
    #replace a with u
    rcodon[n] = re.sub('a', "u", rcodon[n])
    #replace t with a
    rcodon[n] = re.sub('t', "a", rcodon[n])
    #replace c with x
    rcodon[n] = re.sub('c', "x", rcodon[n])
    #replace g with c
    rcodon[n] = re.sub('g', "c", rcodon[n])
    #replace x with g
    rcodon[n] = re.sub('x', "g", rcodon[n])
    print("DNA sequence: ", codon[n] ,'\n', "RNA sequence:", rcodon[n])
    n=n+1
answer = 0
print("Total Sequences: ", len(codon)-3)
while (int(answer) >= 0):
    #str = "Please enter an integer from 0 to " + str(len(dataset)) + " or -1 to quit: "
    answer = int(input("Please enter an sequence you would like to see or -1 to quit: "))
    if(int(answer) >= 0):
        print("DNA sequence: ", codon[int(answer)] ,'\n', "RNA sequence:", rcodon[int(answer)])
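The infinite loop you are facing comes from your function. Note that once you find a candidate i that is not a multiple of 3 away from start, you should add 3 to it; otherwise i = string.find(codon, i) will return the same value of i again. The correction should be: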
def find_codon(codon, string, start):
    i = start + 3
    while i < len(string):
        i = string.find(codon, i) # find the next substring
        if (i - start) % 3 == 0: # check that it's a multiple of 3 after start
            return i
        else:
            i += 3
    return None
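One caveat with a loop of this shape: str.find returns -1 when there is no further match, and -1 is still smaller than len(string), so the loop can keep going or return a bogus index. A minimal sketch of a variant that stops as soon as find reports no match is shown here; the short test sequence is made up for illustration and is not from the question.

```python
def find_codon(codon, string, start):
    """Return the index of the first in-frame `codon` after `start`, or None."""
    i = string.find(codon, start + 3)
    while i != -1:                      # -1 means str.find found nothing more
        if (i - start) % 3 == 0:        # same reading frame as the start codon
            return i
        i = string.find(codon, i + 1)   # resume just past the out-of-frame hit
    return None


# Hypothetical sequence: "taa" occurs out of frame at 4 and in frame at 9.
seq = "atgctaaggtaa"
print(find_codon("taa", seq, 0))  # 9
print(find_codon("tag", seq, 0))  # None
```

Resuming the search at i + 1 rather than i + 3 is a conservative choice: it cannot skip a match, whatever the pattern is.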
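Then you will run into a problem when calling min with None values, which fails with the following error:
stopcodon = min(taacodon, tagcodon, tgacodon)
TypeError: '<' not supported between instances of 'NoneType' and 'int'
Instead of None, you should return some large number to indicate that nothing was found.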
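As a rough sketch of that idea, the helper below (first_stop is a hypothetical name, not part of the question's code) maps None to a sentinel larger than any valid index before calling min, and turns the sentinel back into None so the caller can still tell that nothing was found.

```python
def first_stop(positions, sequence_length):
    """Return the smallest position, treating None as 'not found'."""
    # Assumed sentinel: the sequence length, which is larger than any valid index.
    sentinel = sequence_length
    best = min(sentinel if p is None else p for p in positions)
    return None if best == sentinel else best


# Hypothetical results for taa, tag and tga in a 15-base sequence.
print(first_stop([9, None, 3], 15))        # 3
print(first_stop([None, None, None], 15))  # None
```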
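There are multiple issues with the code above. I will work with the original version, since that was the later edit (so I assume it is the most up to date).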
dataset = "ggtcagaaaaagccctctccatgtctactcacgatacatccctgaaaaccactgaggaagtggcttttcagatcatcttgctttgccagtttggggttgggacttttgccaatgtatttctctttgtctataatttctctccaatctcgactggttctaaacagaggcccagacaagtgattttaagacacatggctgtggccaatgccttaactctcttcctcactatatttccaaacaacatgatga"
startcodon=0
n=0
while(n < 1):
    startcodon=dataset.find("atg", startcodon, len(dataset)-startcodon)
    #locate stop codons
    taacodon=dataset.find("taa", startcodon+3, len(dataset)-startcodon)
    tagcodon=dataset.find("tag", startcodon+3, len(dataset)-startcodon)
    tgacodon=dataset.find("tga", startcodon+3, len(dataset)-startcodon)
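This does not jump in groups of 3. It walks through the string and reports the position of the first match. That is why you always get the same value, no matter what.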
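To make the difference concrete, here is a small sketch on a made-up sequence (not the question's dataset): str.find reports the first match at any offset, while stepping through the string in threes only ever looks at whole codons.

```python
seq = "atgctaaggtaa"            # hypothetical sequence for illustration
start = seq.find("atg")         # 0

# str.find scans every position, so it reports the out-of-frame "taa" at index 4.
print(seq.find("taa", start + 3))   # 4

# Stepping in threes from the start codon only inspects in-frame codons.
in_frame = [i for i in range(start + 3, len(seq), 3) if seq[i:i + 3] == "taa"]
print(in_frame)                     # [9]
```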
    if(taacodon<tagcodon):
        if(taacodon<tgacodon):
            stopcodon=taacodon
            #print("taacodon", startcodon)
        else:
            stopcodon=tgacodon
            #print("tGacodon", startcodon)
    elif(tgacodon>tagcodon):
        stopcodon=tagcodon
        #print("taGcodon", startcodon)
    else:
        stopcodon=tgacodon
        #print("tGacodon", startcodon)
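I assume this is meant to find the first stop codon. However, find returns -1 when it cannot find the string (and since you have no tag, that one always ends up as the stop codon, even though it does not exist).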
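The effect can be reproduced on a short made-up sequence. This is only a sketch of the pitfall, and filtering out the -1 results is one possible way around it, not necessarily the only one.

```python
seq = "atgccctaaggg"   # hypothetical sequence that contains no "tag" or "tga"
hits = {stop: seq.find(stop, 3) for stop in ("taa", "tag", "tga")}
print(hits)            # {'taa': 6, 'tag': -1, 'tga': -1}

# A plain "smallest wins" comparison picks -1 even though that codon was never found.
print(min(hits.values()))   # -1

# Dropping the -1 results first leaves only real positions.
found = [pos for pos in hits.values() if pos != -1]
print(min(found) if found else None)   # 6
```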
n=0;
while(n < len(codon)):
rcodon.append (codon[n][len(codon[n])::-1])
#replace a with u
rcodon[n] = re.sub('a', "u", rcodon[n])
#replace t with a
rcodon[n] = re.sub('t', "a", rcodon[n])
#replace c with x
rcodon[n] = re.sub('c', "x", rcodon[n])
#replace g with c
rcodon[n] = re.sub('g', "c", rcodon[n])
#replace x with g
rcodon[n] = re.sub('x', "g", rcodon[n])
print("DNA sequence: ", codon[n] ,'\n', "RNA sequence:", rcodon[n])
n=n+1
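Using dicts and f-strings cleans things up considerably. I also do not quite understand why you go from c to x and then from x to g.
Finally, your dataset does not contain a stop codon in frame with the first atg, so it cannot be transcribed the way you want.
I added a stop codon at the end of your dataset to get the output you are after: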
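A likely reason for the x step, offered here as a guess: when substitutions are applied one after another, swapping c and g directly would undo itself, so a temporary placeholder is needed. A one-pass lookup avoids that. The sketch below uses str.maketrans and str.translate from the standard library as one way to do the lookup; a plain dict, as in the code further down, works just as well.

```python
seq = "atgc"

# Sequential replacement without a placeholder: c becomes g, then every g becomes c.
naive = seq.replace("c", "g").replace("g", "c")
print(naive)                    # atcc

# One-pass mapping: each base is looked up exactly once, so no placeholder is needed.
table = str.maketrans({"a": "u", "t": "a", "c": "g", "g": "c"})
print(seq.translate(table))     # uacg
```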
dataset = "ggtcagaaaaagccctctccatgtctactcacgatacatccctgaaaaccactgaggaagtggcttttcagatcatcttgctttgccagtttggggttgggacttttgccaatgtatttctctttgtctataatttctctccaatctcgactggttctaaacagaggcccagacaagtgattttaagacacatggctgtggccaatgccttaactctcttcctcactatatttccaaacaacatgtaaa"
rdict={'a':'u','t':'a','c':'g','g':'c'}
start_codon=dataset.find("atg")
for nucleotides in range(start_codon+3,len(dataset),3):
    if dataset[nucleotides:nucleotides+3] in {'taa','tag','tga'}:
        stop_codon=nucleotides
        break  # stop at the first in-frame stop codon
DNA=[]
RNA=[]
for bases in range(start_codon,stop_codon,1):
    DNA.append(dataset[bases])
    RNA.append(rdict[dataset[bases]])
print(f"DNA Sequence: {''.join(DNA)}\nRNA Sequence: {''.join(RNA)}")
while True:
    answer=input('\nplease input sequence you would like to see or exit to quit: ')
    if answer == 'exit':
        break
    try:
        print(f'DNA Sequence: {DNA[int(answer)]}\nRNA Sequence: {RNA[int(answer)]}')
    except (ValueError, IndexError):  # non-numeric input or index out of range
        print('Entry invalid, please input number')
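(You can actually simplify this and use list comprehensions to make it shorter, but I wrote out the loops and made 2 of them so that you get the general idea.)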
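For reference, a shorter variant along those lines might look like the sketch below. It runs on a made-up short sequence rather than the full dataset, and using next() with a default of None to pick the first in-frame stop codon is my own choice, not something taken from the code above.

```python
rdict = {'a': 'u', 't': 'a', 'c': 'g', 'g': 'c'}
dataset = "ccatgaaacccgggtaatt"   # hypothetical short sequence for illustration

start_codon = dataset.find("atg")
# next() yields the first in-frame stop codon, or None if there is none.
stop_codon = next(
    (i for i in range(start_codon + 3, len(dataset), 3)
     if dataset[i:i + 3] in {'taa', 'tag', 'tga'}),
    None,
)

DNA, RNA = [], []
if stop_codon is not None:
    DNA = list(dataset[start_codon:stop_codon])
    RNA = [rdict[base] for base in DNA]

print(f"DNA Sequence: {''.join(DNA)}\nRNA Sequence: {''.join(RNA)}")
# DNA Sequence: atgaaacccggg
# RNA Sequence: uacuuugggccc
```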