Python: multiple function call appends to same list

Question

I'm coming from a Javascript background, and I know this works in Javascript, but what's fundamentally different here with Python?

I'm reading a CSV (sample below) and adding up all the values of a column (based on index parameter) into a list within the get_min_max function, sorting said list, and returning the first and last value in the list, for min and max, respectively.

The first call of get_min_max works great, but the second call fails. What happens is that the values from the second function call get appended to the first list.

How do I prevent the second function call from appending to the same list as the first function call? Clearly, I'm missing something fundamental about Python here.

Sample CSV

0,11,23
1,34,67
2,86,99
3,45,21
4,60,98
5,2,123
6,7,12
7,9,0

Sample Code

import csv

f = open("test.csv", "r")

reader = csv.reader(f, delimiter=",")

def get_min_max(reader, index):
    arr=[]
    for row in reader:
        arr.append(row[index])
    arr.sort()
    return {
        "min": arr[0],
        "max": arr[-1]
    }

get_min_max(reader, 1) # call no. 1
get_min_max(reader, 2) # call no. 2

ERROR

List index out of range on call no. 2. Returning the list on the second call returns empty list; returning the list on the first call returns list of values from the first call and the second call.

Thanks.

Answer 1

By the second call, the data from reader has already been consumed, so iterating over it again yields nothing.

This illustrates the problem:

>>> f = open("test.csv", "r")
>>> import csv
>>> reader = csv.reader(f, delimiter=",")
>>> list(reader)
[['0', '11', '23'], ['1', '34', '67'], ['2', '86', '99'], ['3', '45', '21'], ['4', '60', '98'], ['5', '2', '123'], ['6', '7', '12'], ['7', '9', '0']]
>>> list(reader)
[]

Possible solutions: either cache the file data in a variable, or reopen and read the file inside get_min_max.
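
As a minimal sketch of the caching option, assuming the first three sample rows are held in an in-memory string so it runs without test.csv (SAMPLE, rows and the other variable names are illustrative, not from the question): the reader is a one-shot iterator, so copying its rows into a list once gives something that can be looped over as often as needed.

```python
import csv
import io

# Stand-in for test.csv: the first three rows of the sample data.
SAMPLE = "0,11,23\n1,34,67\n2,86,99\n"

reader = csv.reader(io.StringIO(SAMPLE), delimiter=",")

rows = list(reader)      # consumes the reader once and keeps the rows
leftover = list(reader)  # the reader is exhausted, so nothing is left

first_pass = [row[1] for row in rows]
second_pass = [row[2] for row in rows]  # the list can be reused freely

print(leftover)     # []
print(first_pass)   # ['11', '34', '86']
print(second_pass)  # ['23', '67', '99']
```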

Answer 2

There are two errors: the one that Anthony mentioned (the reader has already consumed the file), and a second one: you're sorting the numbers as strings, which means that "11" < "2".

To fix it:

import csv

def get_min_max(filename, index):
    # first fix: reopen the file on every call (and close it when done)
    with open(filename, "r") as f:
        reader = csv.reader(f, delimiter=",")
        arr=[]
        for row in reader:
            arr.append(int(row[index])) # <-- second fix
    arr.sort()
    return {
        "min": arr[0],
        "max": arr[-1]
    }

print(get_min_max("test.csv", 1)) # prints {'min': 2, 'max': 86} on Python 3.7+
print(get_min_max("test.csv", 2)) # prints {'min': 0, 'max': 123} on Python 3.7+
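
To see the second error in isolation, here is a small sketch using the column 1 values from the sample CSV (the variable names are only illustrative). Sorting the raw strings compares them character by character, while key=int compares them by numeric value:

```python
# Column 1 of the sample CSV, as the csv module returns it (strings).
values = ["11", "34", "86", "45", "60", "2", "7", "9"]

as_strings = sorted(values)           # lexicographic order
as_numbers = sorted(values, key=int)  # numeric order

print(as_strings)  # ['11', '2', '34', '45', '60', '7', '86', '9']
print(as_numbers)  # ['2', '7', '9', '11', '34', '45', '60', '86']
```

With the string ordering, the function would report "11" as the minimum and "9" as the maximum.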

Answer 3

It's because you have already read through the file. Once a file object has been iterated to the end, it yields nothing more; you have to seek back to the beginning of the file using file.seek(0), or cache the data. You should also convert those strings to ints, because comparing them as strings gives odd results such as "11" < "9".
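
A rough sketch of the seek(0) approach, assuming the function is handed the open file object rather than a reader (that signature is my own choice, not the asker's), with io.StringIO standing in for open("test.csv", "r") so the example is self-contained:

```python
import csv
import io

def get_min_max(f, index):
    f.seek(0)  # rewind so every call starts from the first row
    reader = csv.reader(f, delimiter=",")
    values = [int(row[index]) for row in reader]  # ints, not strings
    return {"min": min(values), "max": max(values)}

# Stand-in for the real file; holds the sample CSV from the question.
f = io.StringIO(
    "0,11,23\n1,34,67\n2,86,99\n3,45,21\n"
    "4,60,98\n5,2,123\n6,7,12\n7,9,0\n"
)

first = get_min_max(f, 1)
second = get_min_max(f, 2)
print(first)
print(second)
```

Creating a fresh reader after each seek keeps the two calls independent of each other.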

Answer 4

The answers above explain why the program fails.
If the file is small (less than 10 MB), I suggest you first read the file contents into memory and then do whatever you want with them.

import csv

with open("test.csv", "r") as f:
    rows = list(csv.reader(f, delimiter=","))  # read every row once and keep them

def get_min_max(rows, index):
    arr=[]
    for row in rows:
        arr.append(int(row[index]))  # convert so the sort is numeric
    arr.sort()
    return {
        "min": arr[0],
        "max": arr[-1]
    }

print(get_min_max(rows, 1)) # call no. 1
print(get_min_max(rows, 2)) # call no. 2

Or use a generator to decouple reading the file from the calculation, like this:

import csv

def csv_gen(fileName):
    with open(fileName, "r") as f:
        for row in csv.reader(f, delimiter=","):
            yield row

def get_min_max(rows, index):
    arr=[]
    for row in rows:
        arr.append(int(row[index]))  # convert so the sort is numeric
    arr.sort()
    return {
        "min": arr[0],
        "max": arr[-1]
    }

# each call gets a fresh generator, so the file is reopened and reread
print(get_min_max(csv_gen("test.csv"), 1)) # call no. 1
print(get_min_max(csv_gen("test.csv"), 2)) # call no. 2
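
If rereading the file for every column is a concern, one alternative is to track several columns in a single pass over the rows. This is only a sketch: min_max_per_column is a hypothetical helper, not something from the answers above, and io.StringIO again stands in for the real file.

```python
import csv
import io

def min_max_per_column(rows, indexes):
    # Hypothetical helper: one pass over rows, tracking each requested column.
    result = {}
    for row in rows:
        for i in indexes:
            value = int(row[i])
            if i not in result:
                result[i] = {"min": value, "max": value}
            else:
                result[i]["min"] = min(result[i]["min"], value)
                result[i]["max"] = max(result[i]["max"], value)
    return result

# Stand-in for test.csv; holds the sample CSV from the question.
sample = io.StringIO(
    "0,11,23\n1,34,67\n2,86,99\n3,45,21\n"
    "4,60,98\n5,2,123\n6,7,12\n7,9,0\n"
)

stats = min_max_per_column(csv.reader(sample, delimiter=","), [1, 2])
print(stats[1])
print(stats[2])
```

Because the rows are consumed only once, this works with a plain reader or a generator and never needs the whole file in memory.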

4 answers

solution1
1 ACCEPTED 2014-08-06 01:28:05

solution2
1 2014-08-06 01:36:28

solution3
0 2014-08-06 01:42:19

solution4
0 2014-08-06 03:37:10
