
Finding the minimum of each column of a CSV file using python

I've written a program that finds the minimum of each row of a CSV file. I would now like to do the same for each column, but I have been unable to. Any advice would be greatly appreciated, thank you.

# Import and convert the CSV
import csv
data = []
with open(file, "r") as f:
    reader = csv.reader(f, delimiter=',')
    # make sure the CSV uses "." as the decimal separator, not ","
    nump = 0  # running count of values read
    for row in reader:
        floatrow = []
        for val in row:
            floatrow.append(float(val))
        nump += len(floatrow)
        data.append(floatrow)

# Calculate the minimum of each row (zeros are skipped by filter) and the sum of all values
minrr = []
sum1 = 0.0
for row in data:
    list2 = min(filter(None, row))
    minrr.append(list2)
    sum1 += sum(row)

You could do it with only Python built-ins, transposing the data by zipping all the rows together:

import csv
a = []
with open('path/to/file.csv', "r") as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        # turn all input into floats (a list, so this works on Python 2 and 3)
        row = [float(val) for val in row]
        # append the entire row to create a list of lists
        a.append(row)
# Transpose a into b
b = zip(*a)
# Each row of b is a column of a, so the min of a row of b is the min of a column of a
for line in b:
    print(min(line))
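
As a self-contained sketch of the same transpose idea for Python 3, the version below wraps it in a function. The sample data and the `column_minima` name are assumptions made for illustration, and an in-memory `io.StringIO` stands in for the real file:

```python
import csv
import io

# Hypothetical sample data standing in for the CSV file on disk.
SAMPLE = "10.1,15.6,12.3\n2.1,5.3,7.0\n12.1,7.0,9.3\n"

def column_minima(lines):
    """Return the minimum of each column of comma-separated numeric rows."""
    rows = [[float(val) for val in row] for row in csv.reader(lines)]
    # zip(*rows) transposes the list of rows, so each tuple is one column.
    return [min(col) for col in zip(*rows)]

print(column_minima(io.StringIO(SAMPLE)))
```

Note that `zip` stops at the shortest row, so this assumes every row has the same number of values.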

I'd suggest using np.loadtxt to read the file as an ndarray and then calling np.min with the appropriate axis:

import numpy as np

# loadtxt splits on whitespace by default, so pass the comma explicitly
arr = np.loadtxt('your_file.csv', delimiter=',')
# for each column
minima_c = np.min(arr, axis=0)

# for each row
minima_r = np.min(arr, axis=1)

Here's a little illustration:

In [1]: import numpy as np
In [2]: arr = np.arange(9).reshape((3,3))
In [3]: arr
Out[3]: 
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
In [4]: np.min(arr, 0)
Out[4]: array([0, 1, 2])
In [5]: np.min(arr, 1)
Out[5]: array([0, 3, 6])
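
To try this on comma-separated text without a file on disk, a minimal sketch could look like the following. The sample data is made up for illustration; since `np.loadtxt` splits on whitespace unless told otherwise, the comma is passed as `delimiter`:

```python
import io
import numpy as np

# Hypothetical comma-separated sample; a real call would pass the file path instead.
SAMPLE = "10.1,15.6,12.3\n2.1,5.3,7.0\n12.1,7.0,9.3\n"

# Name the comma explicitly, because the default delimiter is whitespace.
arr = np.loadtxt(io.StringIO(SAMPLE), delimiter=",")

col_min = arr.min(axis=0)  # one value per column
row_min = arr.min(axis=1)  # one value per row
print(col_min)
print(row_min)
```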

The following should work:

import csv

with open("data.csv", "r") as f_input:
    lmin_col = []
    lmin_row = []

    for row in csv.reader(f_input):
        row = map(float, row)
        lmin_row.append(min(row))

        if lmin_col:
            lmin_col = map(min, lmin_col, row)
        else:
            lmin_col = row

    print "Min per row:", lmin_row
    print "Min per col:", lmin_col

With the following as input:

10.1, 15.6, 12.3, 13.2, 17.0
2.1,  5.3,  7.0,  11.4, 5.5
12.1, 7.0,  9.3,  28.7, 1.0

It gives the following output:

Min per row: [10.1, 2.1, 1.0]
Min per col: [2.1, 5.3, 7.0, 11.4, 1.0]

Tested using Python 2.7. Below is an alternative version for Python 3:

import csv

with open("data.csv", "r") as f_input:
    lmin_col = []
    lmin_row = []

    for row in csv.reader(f_input):
        row = [float(col) for col in row]
        lmin_row.append(min(row))

        if lmin_col:
            lmin_col = [min(x,y) for x,y in zip(lmin_col, row)]
        else:
            lmin_col = row

    print("Min per row:", lmin_row)
    print("Min per col:", lmin_col)
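
The row code in the question skips zero entries with `filter(None, row)`. If the column minima should ignore zeros in the same way, one possible sketch follows. The `nonzero_column_minima` name and the sample rows are assumptions for illustration, and reporting an all-zero column as `None` is a choice made here, not something the question specifies:

```python
def nonzero_column_minima(rows):
    """Minimum of each column, ignoring entries equal to zero."""
    minima = []
    for col in zip(*rows):  # zip(*rows) yields one tuple per column
        nonzero = [val for val in col if val != 0]
        # min() raises ValueError on an empty sequence, so guard all-zero columns.
        minima.append(min(nonzero) if nonzero else None)
    return minima

# Hypothetical rows, already converted to floats.
rows = [[10.1, 0.0, 12.3], [2.1, 5.3, 0.0], [0.0, 7.0, 0.0]]
print(nonzero_column_minima(rows))
```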
