Iterating with csv.DictReader

I want to use csv.DictReader to read one column and, depending on the value in the row below, print the difference between the corresponding values in a different column (the value below minus the value above).

I wrote this script:

import csv

next=None
last = None
test_file = 'data.tsv'
csv_file = csv.DictReader(open(test_file, 'rU'), delimiter='\t')
for row in csv_file:
    if row['GT'] == "0/1":
        genotype = row['GT']
        if next is not None:
            if next == "0/1":
                position = int(row['pos'])
                if last is not None:
                    print  (position - last)
                last = position
        next = genotype

When I run it on data.tsv (see below), it does what it is supposed to do, which is to print 80: under the GT column, 0/1 follows 0/1 exactly once, and 832398 - 832318 = 80.

pos GT
815069  0/0
825069  0/1
825410  ./.
830181  1/1
832318  0/1
832398  0/1
832756  0/0

However, when I set

if next == "0/0":

(that is: if the first GT is 0/1 and the next GT is 0/0, print the difference between the corresponding values in the pos column, which is 832756 - 832398 = 358)

it prints nothing! Also, when I change the condition to

if next == "./.":

it prints nothing either. This is the modified script:

import csv

next=None
last = None
test_file = 'data.tsv'
csv_file = csv.DictReader(open(test_file, 'rU'), delimiter='\t')
for row in csv_file:
    if row['GT'] == "0/1":
        genotype = row['GT']
        if next is not None:
            if next == "0/0":  # the changed line
                position = int(row['pos'])
                if last is not None:
                    print  (position - last)
                last = position
        next = genotype

Any ideas why this might be? Thankful for any help! Let me know if I should clarify the description of the problem (Python beginner)

Regards Joanna

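The variable next in your first script is misleadingly named: when it is compared, it does not hold the next GT but the GT of the previous row that matched "0/1" (it also shadows Python's built-in next function). Because next is only assigned inside the if row['GT'] == "0/1": block, it can only ever be None or "0/1", so a comparison with "0/0" or "./." never succeeds. The first script prints 80 only by chance: both GTs are the same, so the order doesn't matter.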
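To see what that variable actually holds, here is a minimal sketch that replays the loop from the question on an inline copy of data.tsv and records the variable's value each time the comparison is reached. The names DATA, trace_flag_values and flag are made up for this illustration; flag stands in for next.

```python
import csv
import io

# Inline copy of data.tsv from the question (tab-separated).
DATA = (
    "pos\tGT\n"
    "815069\t0/0\n"
    "825069\t0/1\n"
    "825410\t./.\n"
    "830181\t1/1\n"
    "832318\t0/1\n"
    "832398\t0/1\n"
    "832756\t0/0\n"
)


def trace_flag_values(text):
    """Replay the question's loop and record what its `next` variable holds
    each time the comparison is reached."""
    seen = []
    flag = None  # hypothetical stand-in for `next` in the question
    for row in csv.DictReader(io.StringIO(text), delimiter='\t'):
        if row['GT'] == "0/1":
            genotype = row['GT']
            seen.append(flag)  # the value the script would compare
            flag = genotype    # only ever assigned "0/1"
    return seen


print(trace_flag_values(DATA))
```

This prints [None, '0/1', '0/1'], so a test against "0/0" or "./." has nothing to match.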

Since you iterate over the file row by row, it is hard to look ahead. Instead, you can look back and compare the current GT with the GT of the previous row, like this:

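import csv

last_gt = None
last_pos = None
test_file = 'data.tsv'
# On Python 3 the csv module expects newline=''; the old 'rU' mode was removed in 3.11.
with open(test_file, newline='') as tsv_file:
    csv_file = csv.DictReader(tsv_file, delimiter='\t')
    for row in csv_file:
        curr_gt = row['GT']
        curr_pos = int(row['pos'])
        # Previous row was 0/1 and the current row is 0/0 (note: 'and', not '&').
        if curr_gt == "0/0" and last_gt == "0/1":
            print(curr_pos - last_pos)
        # Always remember the current row for the next iteration (no 'else' branch).
        last_pos = curr_pos
        last_gt = curr_gt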
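The same look-back idea can be wrapped in a small function so that the two genotypes are parameters rather than hard-coded strings. This is only a sketch: position_gaps, its parameters and the inline DATA string are names invented here, and it assumes the file has exactly the pos and GT columns shown in the question.

```python
import csv
import io

# Inline copy of data.tsv from the question (tab-separated).
DATA = (
    "pos\tGT\n"
    "815069\t0/0\n"
    "825069\t0/1\n"
    "825410\t./.\n"
    "830181\t1/1\n"
    "832318\t0/1\n"
    "832398\t0/1\n"
    "832756\t0/0\n"
)


def position_gaps(lines, prev_gt, curr_gt):
    """Return pos differences for every row whose GT is curr_gt and whose
    directly preceding row has GT prev_gt."""
    gaps = []
    last_gt = None
    last_pos = None
    for row in csv.DictReader(lines, delimiter='\t'):
        pos = int(row['pos'])
        if row['GT'] == curr_gt and last_gt == prev_gt:
            gaps.append(pos - last_pos)
        # Remember this row so the next iteration can look back at it.
        last_gt, last_pos = row['GT'], pos
    return gaps


print(position_gaps(io.StringIO(DATA), "0/1", "0/0"))  # [358]
print(position_gaps(io.StringIO(DATA), "0/1", "0/1"))  # [80]
print(position_gaps(io.StringIO(DATA), "0/1", "./."))  # [341]
```

To run it on the real file, pass an open file object instead of the StringIO, for example open('data.tsv', newline='').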
