How to use a comparison statement on a for loop iterator in a conditional statement?

Question

I'm iterating over a large tab-delimited .txt file (300+ columns and 1,000,000+ rows). The file format is:

species 1    ...    sample1(11th col)    sample2    ....    sampleN(353rd col)
species 2    ...    6046                 5364               ....
species 3    ...    15422                0                  ....

Each row is a species, and from column 11 onward each column is a sample. For each sample, I want to know how many species have a value greater than 0. So I iterate over each line, check which samples have a value greater than 0, and add 1 for each of those. The total for each sample is then the number of rows with a value greater than 0.

For that I use the following code:

samples = []
OTUnumber = []

with open('all.16S.uniq.txt','r') as file:
     for i,line in enumerate(file): 
        columns = line.strip().split('\t')[11:353] 
        if i == 0: #headers are sample names so first row
            samples = columns #save sample names 
            OTUnumbers = [0 for s in samples] #set starting value as zero
        else:
            for n,v in enumerate(columns):
                if v > 0:
                    OTUnumber[n] = OTUnumber[n] + 1
                else:
                    continue

result = dict(zip(samples,OTUnumbers))

When I run this code I get the following error: TypeError: '>' not supported between instances of 'str' and 'int'. The error is raised by if v > 0. Why can't I write this statement?

So if the value v at columns[n] is greater than 0, I want to add 1 to OTUnumber at that index. If v is not greater than 0, I want to skip it and not add 1 (or add 0).

How can I make this code work?

Answer 1

When I run this code I get the following error: TypeError: '>' not supported between instances of 'str' and 'int'. The error is raised by if v > 0. Why can't I write this statement?

As the error says, you are trying to use the comparison operator > on a string and an int, which is not allowed. v is a string, not an integer, because split() returns a list of strings. Presumably you want int(v) > 0 rather than v > 0, or you can convert every field up front:

columns = [int(v) for v in line.strip().split('\t')[11:353]]
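To see the difference in isolation, here is a minimal sketch. The sample line is made up for illustration (the real rows come from the file), and it only assumes that every sample field holds a whole number.

```python
# A made-up data row for illustration; real rows come from the file.
line = "species 2\t6046\t5364\t0\n"

# split() always returns strings, even when they look like numbers.
fields = line.strip().split("\t")[1:]

try:
    fields[0] > 0              # str > int is not defined in Python 3
except TypeError as exc:
    error_message = str(exc)   # same message as in the question

# Convert once, then compare numbers with numbers.
counts = [int(v) for v in fields]
positive = [c > 0 for c in counts]
```

Converting the whole row in one list comprehension keeps the later comparison simple, at the cost of raising ValueError on the first field that is not a valid integer.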

Answer 2

Try this:

samples = []
OTUnumbers = []

with open('all.16S.uniq.txt', 'r') as file:
    for i, line in enumerate(file):
        columns = line.strip().split('\t')[11:353]
        if i == 0:  # headers are sample names, so first row
            samples = columns  # save sample names
            OTUnumbers = [0 for s in samples]  # set starting value as zero
        else:
            for n, v in enumerate(columns):
                if int(v) > 0:
                    OTUnumbers[n] = OTUnumbers[n] + 1
                else:
                    continue

result = dict(zip(samples, OTUnumbers))

That's basically two fixes:

casting v to int
renaming OTUnumber to OTUnumbers throughout the code
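If you want to check the logic without the million-row file, one option is to wrap the loop in a function and feed it a small in-memory table. This is only a sketch: the function name count_positive_per_sample, its start/stop parameters, and the demo data are made up for illustration and are not part of the original code.

```python
import io


def count_positive_per_sample(lines, start, stop=None):
    """Count, for each sample column, the rows whose value is > 0."""
    samples = []
    counts = []
    for i, line in enumerate(lines):
        columns = line.rstrip("\n").split("\t")[start:stop]
        if i == 0:
            samples = columns              # header row holds the sample names
            counts = [0] * len(samples)    # one counter per sample
        else:
            for n, v in enumerate(columns):
                if int(v) > 0:             # cast the string before comparing
                    counts[n] += 1
    return dict(zip(samples, counts))


# Hypothetical miniature table: two leading columns, then two samples.
demo = io.StringIO(
    "species\tinfo\tsample1\tsample2\n"
    "sp1\tx\t6046\t5364\n"
    "sp2\ty\t15422\t0\n"
)
result = count_positive_per_sample(demo, start=2)
```

For the real file you would pass the open file object instead of demo, with start and stop set to the slice bounds you already use.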

Answer 3

The thing is that every value read from your text file is a string, and your code is comparing a string to an integer, which raises a TypeError exception.

To make the code work, convert each value to int before comparing, i.e. int(v) > 0.
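One caveat: int(v) itself raises ValueError if a cell is empty or holds something like "NA" or "12.5". The question does not say whether the file contains such cells, so the following is only a sketch under that assumption. The helper name is_positive is hypothetical, and treating unparseable cells as "not greater than 0" is a choice you may want to change.

```python
def is_positive(value):
    """Return True if the text parses as a number greater than 0."""
    try:
        return float(value) > 0
    except ValueError:
        # Assumption: cells that are not numbers count as "not present".
        return False


# Made-up cells for illustration.
cells = ["6046", "0", "", "NA", "3.5"]
flags = [is_positive(c) for c in cells]
```

In the loop, the test would then read if is_positive(v): instead of if int(v) > 0:.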

3 answers

solution1
1 ACCEPTED 2019-06-24 13:58:42

solution2
1 2019-06-24 14:01:50

solution3
1 2019-06-24 14:02:09