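How to validate a DNA sequence in Python?
Here is the problem.
Write a script that takes a DNA sequence as input from the user and validates whether the input is a DNA sequence. If the input is a DNA sequence:
The output must be written to an external file.
If the input is not a DNA sequence, an error message is displayed and the user is asked whether to enter a correct DNA sequence or terminate the program.
In addition, you need to demonstrate the use of functions in the script.
Below is my script: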
def main():
    seq = input("enter dna sequence: ").casefold()
    for letter in seq:
        if letter not in "atgc":
            answer = input("This is not dna.\nEnter 'Y' to enter a dna sequence or 'N' to terminate the program: ")
            if answer == "Y":
                answer = input("Plase enter: ")
            elif answer == "N":
                print("Program is terminated")
                break
            else:
                print("Please enter Y or N")
        else:
            if letter in "acgt":
                caps = seq.casefold()
                length = len(seq)
                fo = open("aa.txt", "w+")
                fo.write("Sequence: " + seq)
                fo.write("\nLength of sequence: " + str(length))
                fo.write("\nPercentage of nucleotides:- " + "\n")
                accepted_bases = ('a', 'c', 'g', 't')
                for bases in accepted_bases:
                    count = caps.count(bases)
                    content = round(((count / length) * 100), 2)
                    fo.write(str(bases) + "=" + str(content) + "%" + "\n")
                GC_Count = 0
                for letter in seq:
                    if (letter == 'g' or letter == 'c'):
                        GC_Count += 1
                GC = round((float(str(GC_Count)) / float(str(length)) * 100), 2)
                fo.write("GC percentage: " + str(GC) + "%" + "\n")
                RNA = seq.replace('t', 'u')
                fo.write("The rna sequence: " + RNA)
                fo.close()

if __name__ == "__main__":
    main()
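Basically, I want the script to check whether the user's input is a DNA sequence. If it is, the script writes the output file. If it is not, the script tells the user that the sequence is incorrect and asks them to enter 'Y' to enter a DNA sequence (if that one is a DNA sequence, the output file is written) or 'N' to terminate. If the user enters anything other than 'Y' or 'N', it prints "Please enter Y or N".
My problem is that Python reports that my input is not DNA even when the sequence entered on the second attempt is correct. I never told the program to stop, yet it stops on its own after I enter the DNA a second time, and also when I enter a letter other than Y or N.
Does anyone know what is wrong here? I look forward to your replies, thank you.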
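One thing visible in the script above: the text typed after 'Y' is stored in answer, never in seq, so the loop keeps checking the letters of the first input and simply ends when those letters run out. A minimal sketch of checking the whole string at once and re-prompting is shown below. The names VALID_BASES, is_dna and read_dna are hypothetical helpers introduced for illustration, and the ask parameter is only there so the prompt can be replaced in tests.

```python
VALID_BASES = set("acgt")  # hypothetical constant, not part of the original script


def is_dna(seq):
    """Return True if seq is non-empty and contains only a, c, g, t (any case)."""
    seq = seq.strip().casefold()
    return bool(seq) and set(seq) <= VALID_BASES


def read_dna(ask=input):
    """Prompt until a valid sequence is entered; return None if the user quits."""
    while True:
        seq = ask("Enter DNA sequence: ").strip().casefold()
        if is_dna(seq):
            return seq
        answer = ask("This is not DNA. Enter 'Y' to try again or 'N' to quit: ").strip().upper()
        # Keep asking until the reply is one of the two accepted letters.
        while answer not in ("Y", "N"):
            answer = ask("Please enter Y or N: ").strip().upper()
        if answer == "N":
            return None
```

Because the new input is assigned to seq on every pass of the loop, a correct second attempt is validated on its own rather than against the first one.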
I know nothing about DNA, but I think this might work.
# dna = "gcacgctcccagcgatgctctctcagccctcacgggtcatctgaaataatcatattaccccacacaactggcctttgttctgatacatgcatttcgtcttaagcttagtaatcgtcgtattgacgaggaacgaaagttttaagtttttagatcgtattgtaacacgtccatgtgctaaagaacactgtgcgtttcccggatgactcgtgcaccgacattgagtccagctcgaatgacccccgacgctcctggatttcgcgttctcactcgattcccgctgatgaccgacgcgggaaaccattgtctcacgcagaagtccgatcccatatagagcgaaagtctctcagtctcatgactgagcaacattggcggcgaggaccgttggcccttctcgtgtacatcagacgcgcaacttccaatcttgtgcttccaatacatcgaagaaagtctatgatatagcagagaactggcctgtttgtcacttgcgcagaagggggcgtcaaactggaatgtcaacataacgccagtatctctaattttactcgacttcggtaacgcatcatgctacaggatcagttcatcctggagaaagctgtgacaatattcttactagcgcgcggaaggggggggtaactgacaggctgggtatgctgacgggggcgatcccaaatcgaaaactgcccttcccctcgcaacatgagaacaaaaattttgtaagtgaaaagccccctgaaacgtttcgccttgactctcttgagccccggggttttaatacataccccatctgattcgttctagtgctcaccaacactgctacatgatcataggttatatgtggtgcgcccttcgccaatgggcaccaagaaacctactgcgtaaaccaaccttggccgtcggcgaagcttctaagcactgtgtctcgcgaaagagagtaggacgccacctcggcatcaatgtagtacttatgtcggcacccgcatgcgtggtggtcgccctatcg"
def main():
    while True:
        # Normalize case once so validation and counting use the same text.
        seq = input("Enter DNA sequence: ").strip().casefold()
        # Valid only if non-empty and every character is one of a, t, g, c.
        if seq and set(seq) <= set("atgc"):
            length = len(seq)
            # The with block closes the file even if a write fails.
            with open("aa.txt", "w") as fo:
                fo.write("Sequence: " + seq)
                fo.write("\nLength of sequence: " + str(length))
                fo.write("\nPercentage of nucleotides:- " + "\n")
                for base in ('a', 'c', 'g', 't'):
                    content = round(seq.count(base) / length * 100, 2)
                    fo.write(base + "=" + str(content) + "%" + "\n")
                gc_count = seq.count('g') + seq.count('c')
                gc = round(gc_count / length * 100, 2)
                fo.write("GC percentage: " + str(gc) + "%" + "\n")
                # Transcribe to RNA by replacing thymine with uracil.
                fo.write("The rna sequence: " + seq.replace('t', 'u'))
            print("[+] File created!")
        else:
            print("[-] This is not a DNA sequence.")
        again = input("Try another sequence? Y/N: ").capitalize().strip()
        if again == 'Y':
            continue
        else:
            break

if __name__ == "__main__":
    main()
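Since the assignment also asks for functions, one possible way to split the work is sketched below: one function builds the report text and another writes it to the file. The names dna_report and write_report are hypothetical, the report layout follows the fo.write calls above, and the sequence is assumed to have been validated already.

```python
def dna_report(seq):
    """Build the report text for an already validated DNA sequence."""
    seq = seq.casefold()
    length = len(seq)
    lines = [
        "Sequence: " + seq,
        "Length of sequence: " + str(length),
        "Percentage of nucleotides:-",
    ]
    for base in "acgt":
        pct = round(seq.count(base) / length * 100, 2)
        lines.append(f"{base}={pct}%")
    gc = round((seq.count("g") + seq.count("c")) / length * 100, 2)
    lines.append(f"GC percentage: {gc}%")
    # Transcribe to RNA by replacing thymine with uracil.
    lines.append("The rna sequence: " + seq.replace("t", "u"))
    return "\n".join(lines)


def write_report(seq, path="aa.txt"):
    """Write the report for seq to path, overwriting any existing file."""
    with open(path, "w") as fo:
        fo.write(dna_report(seq))
```

Keeping the text building separate from the file writing means the report can be checked, for example with print(dna_report("atgc")), without touching the disk.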