Find and replace CSV strings using a list in Python
I have this so far:
import csv

ifile = open('file', 'rb')
reader = csv.reader(ifile, delimiter='\t')
ofile = open('file', 'wb')
writer = csv.writer(ofile, delimiter='\t')

findlist = ['A', 'G', 'C', 'T', 'Y', 'R', 'W', 'S', 'K', 'M', 'X', 'N', '-']
replacelist = ['AA', 'GG', 'CC', 'TT', 'CT', 'AG', 'AT', 'GC', 'TG', 'CA',
               'NN', 'NN', '-']
rep = dict(zip(findlist, replacelist))

def findReplace(find, replace):
    s = ifile.read()
    s = s.replace(find, replace)
    ofile.write(s)

for item in findlist:
    findReplace(item, rep[item])

ifile.close()
ofile.close()
What it does is replace A with AA. However, what I want is to replace all of the letters with the ones in replacelist. I am very new to Python and can't quite figure out why it's not replacing everything.

Sample input:
HE670865 399908 N N N N N
HE670865 399910 N N N N N
HE670865 399945 T T N T T
HE670865 399951 R R N A A
HE670865 399957 A A N A A
HE670865 399978 C C C M C
HE670865 399980 C C C C C
HE670865 399982 T T T T K
HE670865 399984 C C C C C

Output I get (only A is replaced):
HE670865 399908 N N N N N
HE670865 399910 N N N N N
HE670865 399945 T T N T T
HE670865 399951 R R N AA AA
HE670865 399957 AA AA N AA AA
HE670865 399978 C C C M C
HE670865 399980 C C C C C
HE670865 399982 T T T T K
HE670865 399984 C C C C C
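It is because you are reading and writing inside the loop. The first call to ifile.read() consumes the whole file, so on every later iteration read() returns an empty string and there is nothing left to replace or write. Read once, apply all the replacements to the string, then write once: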
rep = dict(zip(findlist, replacelist))
s = ifile.read()
for item in findlist:
    s = s.replace(item, rep[item])
ofile.write(s)
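If it helps to see why the original loop stopped after the first letter, here is a minimal sketch. It uses an in-memory io.StringIO as a stand-in for the real file (that stand-in and the sample line are assumptions for illustration, not part of the question's code):

```python
import io

# In-memory stand-in for the input file (hypothetical sample data).
ifile = io.StringIO("A\tR\tN\n")

first = ifile.read()   # consumes the whole stream
second = ifile.read()  # the position is now at the end, so nothing is left

print(repr(first))
print(repr(second))
```

The second read() comes back empty, which is what happened on every call to findReplace after the first one.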
Also, I think your code would be more readable (and more concise) without the unnecessary dict.
s = ifile.read()
for item, replacement in zip(findlist, replacelist):
    s = s.replace(item, replacement)
ofile.write(s)
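One thing to watch for with chained str.replace calls: they run over the whole text, so they would also rewrite any matching letters in the ID column, and a later replacement can hit the result of an earlier one (X becomes NN, and the following N to NN step then turns it into NNNN). A possible alternative, sketched here under some assumptions, is to use the csv reader and look each cell up once in the dict. The function name expand_rows, its first_code_col parameter, and the in-memory StringIO buffers are hypothetical and only there to keep the example self-contained; this is Python 3.

```python
import csv
import io

findlist = ['A', 'G', 'C', 'T', 'Y', 'R', 'W', 'S', 'K', 'M', 'X', 'N', '-']
replacelist = ['AA', 'GG', 'CC', 'TT', 'CT', 'AG', 'AT', 'GC', 'TG', 'CA',
               'NN', 'NN', '-']
rep = dict(zip(findlist, replacelist))

def expand_rows(rows, first_code_col=2):
    """Yield rows with each code cell looked up once in rep.

    first_code_col is a hypothetical parameter: columns before it
    (ID and position in the sample data) are copied unchanged.
    """
    for row in rows:
        head, codes = row[:first_code_col], row[first_code_col:]
        # Cells that are not in rep are passed through as they are.
        yield head + [rep.get(cell, cell) for cell in codes]

# In-memory stand-ins for the real input and output files.
sample = ("HE670865\t399951\tR\tR\tN\tA\tA\n"
          "HE670865\t399982\tT\tT\tT\tT\tK\n")
src = io.StringIO(sample, newline='')
out = io.StringIO(newline='')

reader = csv.reader(src, delimiter='\t')
writer = csv.writer(out, delimiter='\t', lineterminator='\n')
writer.writerows(expand_rows(reader))

result = out.getvalue()
print(result)
```

With real files you would open the input and a different output path with newline='' and pass those to csv.reader and csv.writer instead of the StringIO objects. Writing to the same path you are reading from truncates the input before it is read.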