DICT function not working in RNA translate program
I was wondering what could be wrong with the code below, and why I am getting the error KeyError: '['?
The program is meant to translate the input DNA sequence to an RNA sequence, and then produce the amino acid sequence by looking up the RNA sequence stored in RNA in the dict.
Thanks
DNA = "ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGC"
RNA = []
AMINO_ACIDS = {"UUU":"F", "UUC":"F", "UUA":"L", "UUG":"L",
"UCU":"S", "UCC":"S", "UCA":"S", "UCG":"S",
"UAU":"Y", "UAC":"Y", "UAA":"STOP", "UAG":"STOP",
"UGU":"C", "UGC":"C", "UGA":"STOP", "UGG":"W",
"CUU":"L", "CUC":"L", "CUA":"L", "CUG":"L",
"CCU":"P", "CCC":"P", "CCA":"P", "CCG":"P",
"CAU":"H", "CAC":"H", "CAA":"Q", "CAG":"Q",
"CGU":"R", "CGC":"R", "CGA":"R", "CGG":"R",
"AUU":"I", "AUC":"I", "AUA":"I", "AUG":"M",
"ACU":"T", "ACC":"T", "ACA":"T", "ACG":"T",
"AAU":"N", "AAC":"N", "AAA":"K", "AAG":"K",
"AGU":"S", "AGC":"S", "AGA":"R", "AGG":"R",
"GUU":"V", "GUC":"V", "GUA":"V", "GUG":"V",
"GCU":"A", "GCC":"A", "GCA":"A", "GCG":"A",
"GAU":"D", "GAC":"D", "GAA":"E", "GAG":"E",
"GGU":"G", "GGC":"G", "GGA":"G", "GGG":"G",}
RNA_2 = str(RNA)
for char in DNA:
    if char == "G":
        RNA.append("C")
    elif char == "C":
        RNA.append("G")
    elif char == "A":
        RNA.append("U")
    elif char == "T":
        RNA.append("A")
translated = ''.join(AMINO_ACIDS[i] for i in RNA_2)
print("DNA sequence: " + DNA)
print()
print("Length of DNA sequence in base pairs: " + str(len(DNA)))
print()
print("RNA sequence of DNA sequence: " +("".join(RNA)))
print()
print("AMINO ACID sequence: " + str(translated))
You don't need RNA_2, but you do need a way to split an RNA string into chunks of three characters each. Borrowing a chunk function from this post:
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    # range works on Python 2 and 3; the borrowed version used xrange, which is Python 2 only
    for i in range(0, len(l), n):
        yield l[i:i+n]
DNA = "ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGC"
RNA = []
AMINO_ACIDS = {"UUU":"F", "UUC":"F", "UUA":"L", "UUG":"L",
"UCU":"S", "UCC":"S", "UCA":"S", "UCG":"S",
"UAU":"Y", "UAC":"Y", "UAA":"STOP", "UAG":"STOP",
"UGU":"C", "UGC":"C", "UGA":"STOP", "UGG":"W",
"CUU":"L", "CUC":"L", "CUA":"L", "CUG":"L",
"CCU":"P", "CCC":"P", "CCA":"P", "CCG":"P",
"CAU":"H", "CAC":"H", "CAA":"Q", "CAG":"Q",
"CGU":"R", "CGC":"R", "CGA":"R", "CGG":"R",
"AUU":"I", "AUC":"I", "AUA":"I", "AUG":"M",
"ACU":"T", "ACC":"T", "ACA":"T", "ACG":"T",
"AAU":"N", "AAC":"N", "AAA":"K", "AAG":"K",
"AGU":"S", "AGC":"S", "AGA":"R", "AGG":"R",
"GUU":"V", "GUC":"V", "GUA":"V", "GUG":"V",
"GCU":"A", "GCC":"A", "GCA":"A", "GCG":"A",
"GAU":"D", "GAC":"D", "GAA":"E", "GAG":"E",
"GGU":"G", "GGC":"G", "GGA":"G", "GGG":"G",}
for char in DNA:
    if char == "G":
        RNA.append("C")
    elif char == "C":
        RNA.append("G")
    elif char == "A":
        RNA.append("U")
    elif char == "T":
        RNA.append("A")
translated = ''.join(AMINO_ACIDS[i] for i in chunks("".join(RNA), 3))
print("DNA sequence: " + DNA)
print()
print("Length of DNA sequence in base pairs: " + str(len(DNA)))
print()
print("RNA sequence of DNA sequence: " +("".join(RNA)))
print()
print("AMINO ACID sequence: " + str(translated))
Result (run under Python 2, where print() prints an empty tuple as (); under Python 3 those lines are blank):
DNA sequence: ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGC
()
Length of DNA sequence in base pairs: 69
()
RNA sequence of DNA sequence: UGUUCUACGGUAACAGGGGGCCGGAGGACGACGACGACGAGAGGCCCCGGUGCCGGUGGCGACGGGACG
()
AMINO ACID sequence: CSTVTGGRRTTTTRGPGAGGDGT
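One caveat worth noting: the example DNA is 69 bases long, so it splits evenly into 23 codons. With a sequence whose length is not a multiple of three, the last chunk would be shorter than three characters and the dict lookup would raise a KeyError again. A minimal sketch of one way to guard against that, assuming you simply want to ignore a trailing fragment (the translate helper and the three-entry table are hypothetical, not part of the original code):

```python
def chunks(seq, n):
    """Yield successive n-sized chunks from seq."""
    for i in range(0, len(seq), n):
        yield seq[i:i + n]

def translate(rna_text, table):
    """Translate complete codons only; a trailing fragment shorter than 3 is skipped."""
    return "".join(table[c] for c in chunks(rna_text, 3) if len(c) == 3)

# Hypothetical three-entry subset of AMINO_ACIDS, just for illustration.
table = {"UGU": "C", "UCU": "S", "ACG": "T"}

# 11 characters: three full codons plus the leftover fragment "GU".
protein = translate("UGUUCUACGGU", table)
print(protein)
```

Whether to skip the fragment silently or report it is a design choice; skipping is only the simplest option.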
A little more about your original error. I think you may be misunderstanding what RNA_2 = str(RNA) does. It doesn't mean "now and forever, RNA_2 will be the string version of RNA, and keep up-to-date whenever RNA changes". It means "Take the contents of RNA at this instant in time, turn it into a string, and that's what RNA_2 will be, even when RNA changes later". So RNA_2 will be "[]" even after you've appended values to RNA. This is the source of your KeyError. "[" is the first character of RNA_2, and "[" is not present in AMINO_ACIDS.
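The snapshot behaviour described here can be reproduced in a few lines. This is a small sketch, not the original program: rna, snapshot, and the one-entry codon_table are stand-in names chosen for illustration.

```python
# Stand-in for the RNA list; it starts out empty, as in the question.
rna = []

# str() is evaluated once, right now, while the list is still empty.
snapshot = str(rna)

# Later changes to the list do not affect the string made earlier.
rna.extend(["U", "G", "U"])
print(snapshot)

# Hypothetical one-entry stand-in for AMINO_ACIDS.
codon_table = {"UGU": "C"}

# Looking up the first character of the snapshot reproduces the error.
try:
    codon_table[snapshot[0]]
except KeyError as err:
    caught = str(err)
    print("KeyError:", caught)
```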
But even if you did RNA_2 = str(RNA) after you finished your appending loop, I don't think it would give you the result you want. It would be "['U', 'G', 'U', 'U', 'C', ..." rather than "UGUUC". If you want the latter, you ought to use "".join(RNA) rather than str(RNA).
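To make the difference concrete, here is a short comparison on a five-element list (the variable names are illustrative only):

```python
rna = ["U", "G", "U", "U", "C"]

# str() gives the list's printable representation, brackets and quotes included.
as_repr = str(rna)

# "".join() concatenates the elements themselves.
as_text = "".join(rna)

print(as_repr)  # ['U', 'G', 'U', 'U', 'C']
print(as_text)  # UGUUC
```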
But even if you use "".join(RNA), iterating through it and trying to access AMINO_ACIDS won't work, because the keys of AMINO_ACIDS are all three characters long, and iterating over a string gives you one character at a time. That's where chunks comes in, letting you iterate three characters at a time.
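A quick side-by-side of the two ways of iterating, using a nine-character sample string (rna_text, single, and codons are names made up for this sketch; chunks is repeated here with range so the block runs on its own):

```python
def chunks(seq, n):
    """Yield successive n-sized chunks from seq."""
    for i in range(0, len(seq), n):
        yield seq[i:i + n]

rna_text = "UGUUCUACG"

# Plain iteration over a string yields single characters,
# none of which is a key in a codon table.
single = list(rna_text)

# Chunking by 3 yields whole codons that can be looked up.
codons = list(chunks(rna_text, 3))

print(single)  # ['U', 'G', 'U', 'U', 'C', 'U', 'A', 'C', 'G']
print(codons)  # ['UGU', 'UCU', 'ACG']
```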