How do I get particular fields in Python?
I have two rows like the ones below:
Tp1g00130_scaffold_1 blastn exon 20495 20602 . + .
Tp1g00130_scaffold_1 blastn exon 20650 20804 . + .
What I want to do is merge the seq start (field 3 of row 1, counting from 0) and the seq end (field 4 of row 2) of two lines if they have the same ID (the first field). For example, the output would look like:
Tp1g00130_scaffold_1 blastn exon 20495 20804 . + .
I made a good start but cannot quite finish.
prev = None
with open("test_parse") as fh_in:
    for line in fh_in:
        line = line.strip()
        line = line.split()
        line_id = line[0]
        print(line)
        if prev is not None and prev == line_id:
            print("yes")
        prev = line_id
Any help?
You're almost there. Instead of prev being just the ID, make it the whole previous line. This lets us check both existence and ID (if prev and prev[0] == line[0]:) and get the seq start and seq end (print('{} -> {}'.format(prev[3], line[4]))).
prev = None
with open("test_parse") as fh_in:
    for line in fh_in:
        line = line.strip().split()
        if prev and prev[0] == line[0]:
            # Same ID as the previous row: keep its fields but take this row's seq end.
            print(' '.join(prev[:4] + [line[4]] + prev[5:]))
        prev = line
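If an ID can span more than two consecutive rows, the same idea can be sketched with itertools.groupby. This is only an illustration under some assumptions: the helper name merge_consecutive is hypothetical, rows with the same ID are assumed to be adjacent and ordered by position, and the sample list stands in for the lines of the file.

```python
from itertools import groupby


def merge_consecutive(lines):
    """Merge each run of consecutive rows sharing an ID into one row."""
    # Split every non-blank line into whitespace-separated fields.
    rows = (line.split() for line in lines if line.strip())
    merged = []
    for _id, group in groupby(rows, key=lambda fields: fields[0]):
        group = list(group)
        first, last = group[0], group[-1]
        # Keep the first row's fields, but take the seq end from the last row
        # (assumes the rows of a run are ordered by position).
        merged.append(first[:4] + [last[4]] + first[5:])
    return merged


# Sample rows copied from the question; in practice, pass the open file object.
sample = [
    "Tp1g00130_scaffold_1 blastn exon 20495 20602 . + .",
    "Tp1g00130_scaffold_1 blastn exon 20650 20804 . + .",
]
for fields in merge_consecutive(sample):
    print(" ".join(fields))
```

Unlike the pairwise loop, this emits exactly one line per run of identical IDs, and an ID that appears on only a single row is passed through unchanged.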
If your file is small, you can use a temporary dict.
records = {}
with open("test_parse") as fh_in:
    for line in fh_in:
        id_, f1, f2, start, end, f4, f5, f6 = line.strip().split()
        if id_ in records:
            # ID already seen: overwrite the stored seq end with this row's.
            records[id_][4] = end
        else:
            records[id_] = [id_, f1, f2, start, end, f4, f5, f6]
for line in records.values():
    print("\t".join(line))
If you have a header row in your file, you can use a DictReader.
For a file with headers for columns x, y, and z you can do:
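from csv import DictReader

# newline='' is what the csv docs recommend when opening files for the csv module.
with open('sample.csv', newline='') as fh_in:
    reader = DictReader(fh_in)
    for line in reader:
        print(line['x'], line['z'])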
The csv module that DictReader belongs to is very helpful in general.
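As a rough sketch of how DictReader might apply to tab-separated data, the example below reads from an in-memory string instead of a file. The column names x, y, and z follow the example above; the values are made up for illustration, and the tab delimiter is an assumption about the file's layout.

```python
import csv
import io

# Hypothetical sample data standing in for a real file; the values are made up.
sample = io.StringIO("x\ty\tz\n1\t2\t3\n4\t5\t6\n")

# delimiter="\t" tells the reader that fields are tab-separated, not comma-separated.
reader = csv.DictReader(sample, delimiter="\t")

# Each row is a dict keyed by the header names; values are read as strings.
pairs = [(row["x"], row["z"]) for row in reader]
print(pairs)
```

Note that csv splits on a single-character delimiter, so a file whose columns are separated by runs of spaces is usually easier to handle with str.split() as in the earlier answers.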