Access each index result from enumerate() - Python

I have data that looks like this:

Name Nm1    *    *
Ind1     AACTCAGCTCACG
Ind2     GTCATCGCTACGA 
Ind3     CTTCAAACTGACT

I need to grab the letter at each position marked by an asterisk in the "Name" line and print it, along with the index of the asterisk.

So the result would be:

Ind1, 12, T
Ind2, 12, A
Ind3, 12, C
Ind1, 17, T
Ind2, 17, T
Ind3, 17, T

I'm trying to use enumerate() to retrieve the positions of the asterisks; my thought was that I could then use these indexes to grab the letters.

import sys
import csv

input = open(sys.argv[1], 'r')
Output = open(sys.argv[1]+"_processed", 'w')

indlist = (["Individual_1,", "Individual_2,", "Individual_3,"])

with (input) as searchfile:
    for line in searchfile:
        if '*' in line:
            LocusID = line[2:13]
            LocusIDstr = LocusID.strip()
            hit = line
            for i, x in enumerate(hit):
                     if x=='*':
                      position = i
                      print position

    for item in indlist:
        Output.write("%s%s%s\n" % (item, LocusIDstr, position))

Output.close()

If enumerate() outputs, e.g.,

12
17

How do I access each index separately?

Also, when I print the position, I get the numbers I want. When I write to the file, however, only the last position is written. Why is this?

----------------EDIT-----------------

Following the advice below, I have split up my code to make it a bit simpler (for me) to understand.

import sys
import csv

input = open(sys.argv[1], 'r')
Output = open(sys.argv[1]+"_FGT_Data", 'w')

indlist = (["Individual_1,", "Individual_2,", "Individual_3,"])

with (input) as searchfile:
    for line in searchfile:
        if '*' in line:
            LocusID = line[2:13]
            LocusIDstr = LocusID.strip()
            print LocusIDstr
            hit = line
            for i, x in enumerate(hit):
                    if x=='*':
                      position = i
                      #print position

input = open(sys.argv[1], 'r')

with (input) as searchfile:
    for line in searchfile:
        if line [0] == ">":
            print line[position], position

with (Output) as writefile:
    for item in indlist:
        writefile.write("%s%s%s\n" % (item, LocusIDstr, position))

Output.close()

I still do not have a solution for how to access each of the indexes, though.

Edit: changed to work with the file you gave me in your comment. If you made this file yourself, consider working with columns next time.

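import os
import sys

read_file = sys.argv[1]
# Build "<name>_processed<ext>"; splitext also copes with names that have no extension.
base, ext = os.path.splitext(read_file)
write_file = "%s_processed%s" % (base, ext)
indexes = []
lines_to_write = []

# The first line carries the asterisks; keep every index instead of only the last one.
with open(read_file, 'r') as getindex:
    first_line = getindex.readline()
    for i, x in enumerate(first_line):
        if x == '*':
            # The offset of 11 is specific to the file from the comment.
            indexes.append(i - 11)

# Sequence lines start with ">"; name and sequence are separated by four spaces.
with open(read_file, 'r') as getsnps:
    for line in getsnps:
        if line.startswith(">"):
            name, sequence = line.split("    ")[:2]
            for z in indexes:
                # Report the position 1-based, followed by the letter found there.
                lines_to_write.append("%s\t%s\t%s" % (name, z + 1, sequence[z]))

# Use a separate name for the file handle so the path variable is not shadowed.
with open(write_file, "w") as outfile:
    outfile.write("\n".join(lines_to_write))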
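
As for why only the last position ended up in the file: in the question's code, position is reassigned on every match, and the write happens after the loop has finished, so only the final value is still there. Collecting the indexes in a list keeps all of them. Below is a minimal sketch of that idea on its own, run against the small sample from the question rather than the file from the comment. It assumes the asterisks in the header line up with raw character offsets in each data line, as they do in that sample. The names SAMPLE_LINES, marked_positions and extract_marked are made up for illustration.

```python
# Sample data from the question, with the spacing spelled out so the
# column offsets are unambiguous.
SAMPLE_LINES = [
    "Name Nm1" + " " * 4 + "*" + " " * 4 + "*",
    "Ind1" + " " * 5 + "AACTCAGCTCACG",
    "Ind2" + " " * 5 + "GTCATCGCTACGA",
    "Ind3" + " " * 5 + "CTTCAAACTGACT",
]


def marked_positions(header):
    """Return the index of every '*' in the header line."""
    return [i for i, ch in enumerate(header) if ch == "*"]


def extract_marked(lines):
    """Return (name, position, letter) for each marked column and each row."""
    header, rows = lines[0], lines[1:]
    results = []
    for pos in marked_positions(header):
        for row in rows:
            name = row.split()[0]  # first whitespace-separated field
            results.append((name, pos, row[pos]))
    return results


if __name__ == "__main__":
    for name, pos, letter in extract_marked(SAMPLE_LINES):
        print("%s, %s, %s" % (name, pos, letter))
```

On the sample this should print the six lines listed in the question, starting with "Ind1, 12, T". For a real file you would read the lines with open() and adjust the offset, as the code above does with the 11.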
