Parse a text file into an array in Python
A C G T
A 2 -1 -1 -1
C -1 2 -1 -1
G -1 -1 2 -1
T -1 -1 -1 2
The file is a tab-separated text file, and I want to map it into a Python structure similar to this:
{'A': {'A': 91, 'C': -114, 'G': -31, 'T': -123},
'C': {'A': -114, 'C': 100, 'G': -125, 'T': -31},
'G': {'A': -31, 'C': -125, 'G': 100, 'T': -114},
'T': {'A': -123, 'C': -31, 'G': -114, 'T': 91}}
I have tried very hard, but I cannot figure out how to do this because I am new to Python.
Please help.
My code so far:
seq = flines[0]
newseq = []
j = 0
while(l < 4):
    i = 2
    while(o < 4):
        newseq[i][j] = seqLine[i]
        i = i + 1;
        o = o + 1
    j = j + 1
    l = l + 1
print (seq)
print(seqLine)
I think this is what you want:
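import csv

data = {}
# Python 2: the csv module wants the file opened in binary mode ('rb').
# On Python 3, use open('myfile.csv', newline='') instead.
with open('myfile.csv', 'rb') as csvfile:
    ntreader = csv.reader(csvfile, delimiter="\t", quotechar='"')
    for rowI, rowData in enumerate(ntreader):
        if rowI == 0:
            # First row: skip the empty corner cell, keep the column labels.
            headers = rowData[1:]
        else:
            # Other rows: the first cell is the row label, the rest are scores.
            data[rowData[0]] = {k: int(v) for k, v in zip(headers, rowData[1:])}
print(data)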
To keep things simple, I use the csv module and set tab as the delimiter. I then take the column headers from the first row and use them to label the values in every other row.
This produces:
{'A': {'A': 2, 'C': -1, 'T': -1, 'G': -1},
'C': {'A': -1, 'C': 2, 'T': -1, 'G': -1},
'T': {'A': -1, 'C': -1, 'T': 2, 'G': -1},
'G': {'A': -1, 'C': -1, 'T': -1, 'G': 2}}
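For Python 3, the same idea could be sketched roughly as follows. This is only an illustration under some assumptions: the names SAMPLE and parse_matrix are made up for this example, and the sample text stands in for the real file so that the snippet runs on its own.

```python
import csv
import io

# Hypothetical stand-in for the tab-separated file from the question.
SAMPLE = (
    "\tA\tC\tG\tT\n"
    "A\t2\t-1\t-1\t-1\n"
    "C\t-1\t2\t-1\t-1\n"
    "G\t-1\t-1\t2\t-1\n"
    "T\t-1\t-1\t-1\t2\n"
)


def parse_matrix(lines):
    """Build {row_label: {column_label: int_score}} from tab-separated lines."""
    reader = csv.reader(lines, delimiter="\t")
    # First row: an empty corner cell followed by the column labels.
    headers = [h.strip() for h in next(reader)[1:]]
    matrix = {}
    for row in reader:
        if not row:  # skip blank lines
            continue
        # First cell is the row label; the remaining cells are the scores.
        matrix[row[0].strip()] = {k: int(v) for k, v in zip(headers, row[1:])}
    return matrix


scores = parse_matrix(io.StringIO(SAMPLE))
print(scores["A"]["C"])  # -1
```

To read from a real file instead, pass an open file object, for example: with open('myfile.csv', newline='') as f: scores = parse_matrix(f).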
Edit:
For Python < 2.7, it should work if you replace the dictionary comprehension line (data[rowData[0]] = ....) above with a simple loop in the same place:
rowDict = dict()
for k, v in zip(headers, rowData[1:]):
    rowDict[k] = int(v)
data[rowData[0]] = rowDict
Using csv.DictReader gets you most of the way there on its own:
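from csv import DictReader

# DictReader needs an open file (or any iterable of lines), not a file name.
reader = DictReader(open('file.csv'), delimiter='\t')
#dictdata = {row['']: row for row in reader} # <-- python 2.7+ only
dictdata = dict((row[''], row) for row in reader) # <-- python 2.6 safe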
Outputs:
{'A': {None: [''], '': 'A', 'A': '2', 'C': '-1', 'G': '-1', 'T': '-1'},
'C': {'': 'C', 'A': '-1', 'C': '2', 'G': '-1', 'T': '-1'},
'G': {'': 'G', 'A': '-1', 'C': '-1', 'G': '2', 'T': '-1'},
'T': {'': 'T', 'A': '-1', 'C': '-1', 'G': '-1', 'T': '2'}}
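The None: [''] entry in the 'A' row is consistent with how DictReader treats a row that has more fields than the header: the surplus fields are collected in a list under the restkey key, which defaults to None. Presumably the 'A' line in the test file ended with an extra tab; that is an assumption, and the two-line input below is made up to reproduce the effect.

```python
import csv
import io

# Hypothetical two-line input: the data row ends with an extra tab,
# so it has one more field than the header row.
text = "\tA\tC\nA\t2\t-1\t\n"

rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
print(rows[0][None])  # the surplus field(s), collected in a list
```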
Cleaning up the extraneous keys got a little messy because the inner dict has to be rebuilt, but you can do it by replacing the last line with this:
dictdata = {row['']: {key: value for key, value in row.iteritems() if key} for row in reader}
Outputs:
{'A': {'A': '2', 'C': '-1', 'G': '-1', 'T': '-1'},
'C': {'A': '-1', 'C': '2', 'G': '-1', 'T': '-1'},
'G': {'A': '-1', 'C': '-1', 'G': '2', 'T': '-1'},
'T': {'A': '-1', 'C': '-1', 'G': '-1', 'T': '2'}}
Edit: for Python < 2.7
Dictionary comprehensions were added in 2.7. For 2.6 and lower, use the dict constructor:
dictdata = dict((row[''], dict((key, value) for key, value in row.iteritems() if key)) for row in reader)
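On Python 3, iteritems() no longer exists, so the cleanup step would use items() instead. The following is a minimal sketch under a few assumptions: SAMPLE is a made-up stand-in for the real file (with an extra trailing tab on the 'A' row to show that surplus fields are dropped), and the scores are converted to int, which the DictReader snippets above do not do.

```python
import csv
import io

# Hypothetical stand-in for the tab-separated file from the question.
# The 'A' row ends with an extra tab on purpose.
SAMPLE = (
    "\tA\tC\tG\tT\n"
    "A\t2\t-1\t-1\t-1\t\n"
    "C\t-1\t2\t-1\t-1\n"
    "G\t-1\t-1\t2\t-1\n"
    "T\t-1\t-1\t-1\t2\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
# "if key" drops both the '' key (row label) and the None key (surplus
# fields); the remaining values are converted from str to int.
dictdata = {
    row[""]: {key: int(value) for key, value in row.items() if key}
    for row in reader
}
print(dictdata["G"]["G"])  # 2
```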