How do I get my inner for loop to iterate every time my outer for loop iterates?

I have two files, reads.bed and feats.bed. For each line of reads.bed I build a list that ends with an empty nested list, and I want to append the strings from the last column of feats.bed to that nested list. A string should be appended only if the number in the second column of feats.bed falls between the numbers in the second and third columns of reads.bed.

Here are my files:

reads.bed:

chromA  10      69      read1
chromA  10      35      read2
chromA  10      55      read3
chromA  15      69      read4
chromA  80      119     read5
chromA  80      111     read6
chromA  90      119     read7
chromA  101     119     read8

feats.bed:

chromA  10      19      feat1
chromA  30      39      feat2
chromA  50      69      feat3
chromA  80      89      feat4
chromA  100     119     feat5

Here is my code:

feat_bed=open("feats.bed","r")
read_bed=open("reads.bed","r")


read_coords=[]
for line in read_bed.readlines():
    line=line.strip()
    line=line.split("\t")
    read_coords.append([int(line[1]),int(line[2]),str(line[3]),[]])


for read in read_coords:
    for feat in feat_bed.readlines():
        feat=feat.strip()
        feat=feat.split("\t")
        if int(read[1]) > int(feat[1]) >= int(read[0]):
            read[3].append(str(feat[3]))
    print read

My expected output would be:

[10, 69, 'read1', ['feat1', 'feat2', 'feat3']]
[10, 35, 'read2', ['feat1', 'feat2']]
[10, 55, 'read3', ['feat1', 'feat2', 'feat3']]
[15, 69, 'read4', ['feat2', 'feat3']]
[80, 119, 'read5', ['feat4', 'feat5']]
[80, 111, 'read6', ['feat4', 'feat5']]
[90, 119, 'read7', ['feat5']]
[101, 119, 'read8', []]

Instead, my inner for loop seems to iterate only the first time, and then it stops, so my actual output is:

[10, 69, 'read1', ['feat1', 'feat2', 'feat3']]
[10, 35, 'read2', []]
[10, 55, 'read3', []]
[15, 69, 'read4', []]
[80, 119, 'read5', []]
[80, 111, 'read6', []]
[90, 119, 'read7', []]
[101, 119, 'read8', []]

I don't understand why my inner loop stops iterating after the first iteration of my outer loop. If someone could point out what I'm doing wrong that would be super helpful. Thanks.

This happens because readlines() reads every line from the current position to the end of the file. After the first call, the file pointer sits at the end of the file, so every later call to readlines() on the same file object returns an empty list.
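The behaviour can be reproduced in a few lines. This is a minimal sketch that uses io.StringIO as an in-memory stand-in for feats.bed (an assumption made so the example runs without files on disk; a file object returned by open() behaves the same way in this respect). It also shows that seek(0) rewinds the pointer, which is an alternative to re-opening the file.

```python
import io

# In-memory stand-in for feats.bed (assumption: used instead of open()
# so the example is self-contained)
feat_bed = io.StringIO("chromA\t10\t19\tfeat1\nchromA\t30\t39\tfeat2\n")

first_pass = feat_bed.readlines()   # reads up to the end of the file
second_pass = feat_bed.readlines()  # pointer is already at the end

feat_bed.seek(0)                    # rewind to the start of the file
third_pass = feat_bed.readlines()   # the lines can be read again

print(len(first_pass), second_pass, len(third_pass))
```

Rewinding with seek(0) on every outer iteration would work, but it re-reads the same data each time, so storing the lines once is usually the simpler choice.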

Save the lines to a list beforehand, for example feat_lines = feat_bed.readlines(), and then iterate over that saved list in the inner loop: for feat in feat_lines:
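As a rough sketch of that approach, the function below takes the lines of both files as lists and parses the features once, before the loop over reads. The name annotate_reads and the inline sample lines are hypothetical, chosen only for illustration; the overlap test is the same one used in the question.

```python
def annotate_reads(read_lines, feat_lines):
    """Attach to each read the names of features whose start lies in [start, end)."""
    # Parse the features once, outside the loop over reads
    feats = []
    for line in feat_lines:
        fields = line.strip().split("\t")
        feats.append((int(fields[1]), fields[3]))

    read_coords = []
    for line in read_lines:
        fields = line.strip().split("\t")
        start, end, name = int(fields[1]), int(fields[2]), fields[3]
        # Same condition as in the question: end > feat_start >= start
        hits = [feat_name for feat_start, feat_name in feats
                if start <= feat_start < end]
        read_coords.append([start, end, name, hits])
    return read_coords


# Sample lines copied from the question (in practice: open(...).readlines())
feat_lines = [
    "chromA\t10\t19\tfeat1\n",
    "chromA\t30\t39\tfeat2\n",
    "chromA\t50\t69\tfeat3\n",
    "chromA\t80\t89\tfeat4\n",
    "chromA\t100\t119\tfeat5\n",
]
read_lines = [
    "chromA\t10\t69\tread1\n",
    "chromA\t90\t119\tread7\n",
    "chromA\t101\t119\tread8\n",
]

result = annotate_reads(read_lines, feat_lines)
for read in result:
    print(read)
```

Because feat_lines is an ordinary list, it can be iterated as many times as needed, unlike the file object.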

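Nest the feature loop inside the loop over reads.bed, with the correct indentation. Read feats.bed into a list once, before the loops, because a second call to readlines() on the same file object returns an empty list:

feat_bed=open("feats.bed","r")
read_bed=open("reads.bed","r")

# Read the feature lines once; calling readlines() again would return []
feat_lines=feat_bed.readlines()

read_coords=[]
for line in read_bed.readlines():
    line=line.strip()
    line=line.split("\t")
    read = [int(line[1]),int(line[2]),str(line[3]),[]]

    for feat in feat_lines:
        feat=feat.strip()
        feat=feat.split("\t")
        # Keep features whose start lies between the read's start and end
        if read[1] > int(feat[1]) >= read[0]:
            read[3].append(str(feat[3]))
    read_coords.append(read)
    print(read)

feat_bed.close()
read_bed.close()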
