How to use a dictionary to modify a string with Python?
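I have a string and a dictionary. I have to replace parts of the string (using the dictionary keys) with the corresponding values from the dictionary.
Given string:
rna = "AUGCAUGUACCGAAUGCUGAGGGGGCUUCCUAA"
Given dictionary: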
amino_acids = {"UUU" : "Phe", "UUC" : "Phe", "UUA" : "Leu", "UUG" : "Leu",
"CUU" : "Leu", "CUC" : "Leu", "CUA" : "Leu", "CUG" : "Leu",
"AUU" : "Ile", "AUC" : "Ile", "AUA" : "Ile", "AUG" : "Met",
"GUU" : "Val", "GUC" : "Val", "GUA" : "Val", "GUG" : "Val",
"UCU" : "Ser", "UCC" : "Ser", "UCA" : "Ser", "UCG" : "Ser",
"CCU" : "Pro", "CCC" : "Pro", "CCA" : "Pro", "CCG" : "Pro",
"ACU" : "Thr", "ACC" : "Thr", "ACA" : "Thr", "ACG" : "Thr",
"GCU" : "Ala", "GCC" : "Ala", "GCA" : "Ala", "GCG" : "Ala",
"UAU" : "Tyr", "UAC" : "Tyr", "UAA" : "STOP", "UAG" : "STOP",
"CAU" : "His", "CAC" : "His", "CAA" : "Gln", "CAG" : "Gln",
"AAU" : "Asn", "AAC" : "Asn", "AAA" : "Lys", "AAG" : "Lys",
"GAU" : "Asp", "GAC" : "Asp", "GAA" : "Glu", "GAG" : "Glu",
"UGU" : "Cys", "UGC" : "Cys", "UGA" : "STOP", "UGG" : "Trp",
"CGU" : "Arg", "CGC" : "Arg", "CGA" : "Arg", "CGG" : "Arg",
"AGU" : "Ser", "AGC" : "Ser", "AGA" : "Arg", "AGG" : "Arg",
"GGU" : "Gly", "GGC" : "Gly", "GGA" : "Gly", "GGG" : "Gly"
}
Expected output:
Met-His-Val-Pro-Asn-Ala-Glu-Gly-Ala-Ser-*
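Question: What am I doing wrong? How can this be done without modules? Thanks!
Edit:
Solution
def rna_to_protein(rna):
    # Split the RNA string into codons of 3 characters each
    acids = [rna[i:i+3] for i in range(0, len(rna), 3)]
    # Look up each codon and join the names with hyphens
    protein = "-".join(amino_acids[acid] for acid in acids)
    # Show the stop codon as "*", as in the expected output
    protein = protein.replace("STOP", "*")
    return protein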
You can split the rna string into individual strings of length 3:
>>> rna = "AUGCAUGUACCGAAUGCUGAGGGGGCUUCCUAA"
>>> acids = [rna[i:i+3] for i in range(0, len(rna), 3)]
>>> acids
['AUG', 'CAU', 'GUA', 'CCG', 'AAU', 'GCU', 'GAG', 'GGG', 'GCU', 'UCC', 'UAA']
Then you can use them to look up the amino acids in the dictionary:
>>> "-".join(amino_acids[acid] for acid in acids)
'Met-His-Val-Pro-Asn-Ala-Glu-Gly-Ala-Ser-STOP'
Keep it simple...
# Caution: replace() matches a key anywhere in the string, not only at codon boundaries
for key in amino_acids:
    rna = rna.replace(key, amino_acids[key])
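One thing to watch with the replace loop: `str.replace` has no notion of a reading frame, so a key that straddles two codons is replaced as well. The sketch below illustrates this with a hypothetical three-entry table (`codon_table`, not the full dictionary from the question), ordered so that the straddling key is tried first. The helper names are made up for the illustration.

```python
# Hypothetical three-entry table, ordered so the problem shows up immediately.
codon_table = {"UGC": "Cys", "AUG": "Met", "CAU": "His"}

def translate_by_replace(rna, table):
    # Replaces each key wherever it occurs, ignoring codon boundaries.
    for key in table:
        rna = rna.replace(key, table[key])
    return rna

def translate_by_codon(rna, table):
    # Reads the string in fixed steps of 3, so the reading frame is kept.
    codons = [rna[i:i+3] for i in range(0, len(rna), 3)]
    return "-".join(table[codon] for codon in codons)

rna = "AUGCAU"  # codons: AUG | CAU
naive = translate_by_replace(rna, codon_table)
framed = translate_by_codon(rna, codon_table)
print(naive)   # ACysAU
print(framed)  # Met-His
```

Here "UGC" is found at positions 1 to 3, across the AUG | CAU boundary, so the replace version produces a mix of bases and names. Whether the full dictionary hits this depends on the input and on the key order, which is why slicing in steps of 3 is the safer route.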
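The simplest approach is to take every 3 characters, find the corresponding value in the dictionary, and append it to the result string.
(However, this assumes the RNA is always complete, i.e. its length is a multiple of 3.)
def rna_to_proteins(rna, amino_acids):
    result = ""
    i = 0
    # Step through the string 3 characters at a time and look each codon up in the dictionary
    while i < len(rna):
        # Get 3 characters
        sequence = rna[i:i+3]
        print(sequence)
        # Compare with != here; "is not" tests object identity, not string equality
        if amino_acids[sequence] != "STOP":
            result += amino_acids[sequence]
            result += "-"
        else:
            result += "*"
        i += 3
    # Finally print the result string
    print(result)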
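The loop above relies on the RNA being complete. If that cannot be guaranteed, a variant along the following lines may help. It is only a sketch under a few assumptions that are not in the original answers: trailing bases that do not form a full codon are ignored, an unknown codon raises `ValueError`, and translation ends at the first STOP codon. The name `rna_to_protein_safe` and the small `subset` table are made up for the example.

```python
def rna_to_protein_safe(rna, amino_acids):
    """Translate codon by codon, stopping at the first STOP codon."""
    names = []
    # Assumption: 1 or 2 leftover bases at the end are silently ignored.
    for i in range(0, len(rna) - len(rna) % 3, 3):
        codon = rna[i:i+3]
        name = amino_acids.get(codon)
        if name is None:
            # Assumption: an unknown codon is treated as an error.
            raise ValueError(f"unknown codon {codon!r} at position {i}")
        if name == "STOP":
            names.append("*")
            break
        names.append(name)
    return "-".join(names)

# Small subset of the table from the question, enough for the demo.
subset = {"AUG": "Met", "CAU": "His", "GUA": "Val", "UAA": "STOP"}
print(rna_to_protein_safe("AUGCAUGUAUAAAUG", subset))  # Met-His-Val-*
print(rna_to_protein_safe("AUGCAUGU", subset))         # Met-His
```

Returning the string instead of printing it makes the function easier to reuse; passing the full `amino_acids` dictionary from the question in place of `subset` should work the same way.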