Using a for loop to replace bad nucleotides in a DNA sequence
I have a list of sequences (simplified here for illustration):
seqList=["ACCTGCCSSSTTTCCT","ACCTGCCFFFTTTCCT"]
I want to use a for loop to replace every nucleotide that is not one of ["A","C","G","T"] with "N".
My code so far:
seqList=["ACCTGCCSSSTTTCCT","ACCTGCCFFFTTTCCT"]
for x in range(len(seqList)):
    for i in range(len(seqList[x])):
        if seqList[x][i] not in ["A","C","G","T"]:
            seqList[x][i].replace(seqList[x][i],"N")
print(seqList)
The problem is that the nucleotides are not replaced: the original sequences come out unchanged, and I don't know why.
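Strings in Python are immutable, so str.replace returns a new string instead of modifying the original, and your loop discards that result. You can make it work like this: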
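seqList = ["ACCTGCCSSSTTTCCT", "ACCTGCCFFFTTTCCT"]
for x in range(len(seqList)):
    # Strings are immutable, so work on a mutable list of characters
    stringl = list(seqList[x])
    for i in range(len(seqList[x])):
        if seqList[x][i] not in ["A", "C", "G", "T"]:
            stringl[i] = "N"
    # Join the characters back into a string and store it in the list
    seqList[x] = "".join(stringl)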
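To see why the original loop has no effect, here is a minimal sketch of string immutability. The variable names `seq` and `result` are only illustrative and do not come from the question.

```python
seq = "ACCTGCCSSSTTTCCT"

# str.replace returns a new string; the original is left untouched
result = seq.replace("S", "N")
print(seq)     # ACCTGCCSSSTTTCCT
print(result)  # ACCTGCCNNNTTTCCT

# Item assignment on a string is not allowed and raises TypeError
try:
    seq[7] = "N"
except TypeError as exc:
    print("TypeError:", exc)
```

In the question's code the value returned by replace is never assigned to anything, so it is simply thrown away.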
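Here is an approach that does not loop over every letter. It finds the distinct letters that are not A, C, G, or T and replaces each of them throughout the sequence: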
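def replace_bad(seq):
    # Collect each distinct letter that is not a valid nucleotide
    unique = [
        letter
        for letter in set(seq)
        if letter not in "ACGT"
    ]
    # Replace every occurrence of each invalid letter in one call
    for each in unique:
        seq = seq.replace(each, "N")
    return seq

if __name__ == '__main__':
    seqList = ["ACCTGCCSSSTTTCCT", "ACCTGCCFFFTTTCCT"]
    for seq in seqList:
        print(replace_bad(seq))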
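As a possible alternative not mentioned in the answers above, the same replacement can be sketched with the standard library's `re.sub`. The function name `replace_bad_regex` is hypothetical, and the sketch assumes the sequences are uppercase, so a lowercase "a" would also become "N".

```python
import re

def replace_bad_regex(seq):
    # [^ACGT] matches any single character that is not A, C, G or T
    return re.sub(r"[^ACGT]", "N", seq)

seqList = ["ACCTGCCSSSTTTCCT", "ACCTGCCFFFTTTCCT"]
cleaned = [replace_bad_regex(seq) for seq in seqList]
print(cleaned)  # ['ACCTGCCNNNTTTCCT', 'ACCTGCCNNNTTTCCT']
```

This builds a new list rather than modifying seqList in place, which avoids index bookkeeping entirely.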