
Finding the minimum of each column of a CSV file using Python

I've created a program that finds the minimum of each row of a CSV file, and I would now like to do the same for each column, but I have been unable to do so. Any advice would be greatly appreciated, thank you.

# Import the CSV and convert every value to float
import csv
data = []
# `file` is assumed to hold the path to the CSV
with open(file, "r") as f:
    # Make sure the CSV uses "." as the decimal mark, not ","
    reader = csv.reader(f, delimiter=',')
    nump = 0
    for row in reader:
        floatrow = []
        for val in row:
            floatrow.append(float(val))
        nump += len(floatrow)
        data.append(floatrow)

# Calculates the minimum of each row and the running sum of all values
minrr = []
sum1 = 0.0
for row in data:
    # filter(None, row) drops zero values before taking the minimum
    list2 = min(filter(None, row))
    minrr.append(list2)
    sum1 += sum(row)

You could do it with only Python built-in functions and a transpose achieved by zipping all the rows, as follows:

import csv
a = []
with open('path/to/file.csv', "r") as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        # turn all input to floats (a list, so it also works on Python 3)
        row = [float(val) for val in row]
        # append the entire row to create a list of lists
        a.append(row)
# Transpose a into b
b = zip(*a)
# Now the min of each row of b is the min of each column of a
for line in b:
    print(min(line))
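
To see the transpose step in isolation, here is a minimal sketch that works on in-memory data rather than a file. The helper name `column_minima` and the sample rows are assumptions made for illustration. Note that `zip` stops at the shortest row, so ragged rows would be silently truncated.

```python
def column_minima(rows):
    """Return the minimum of each column of a list of equal-length rows."""
    # zip(*rows) groups the i-th element of every row, i.e. it transposes.
    return [min(column) for column in zip(*rows)]


# Hypothetical sample data, already converted to floats.
rows = [
    [10.1, 15.6, 12.3],
    [2.1, 5.3, 7.0],
    [12.1, 7.0, 9.3],
]
minima = column_minima(rows)
print(minima)
```

Wrapping the logic in a function keeps the file reading separate from the calculation, which makes it easier to check on small inputs.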

I'd suggest using np.loadtxt to read the file as an ndarray and then calling np.min along a given axis:

import numpy as np

# loadtxt splits on whitespace by default, so pass the comma explicitly
arr = np.loadtxt('your_file.csv', delimiter=',')
# for each column
minima_c = np.min(arr, axis=0)

# for each row
minima_r = np.min(arr, axis=1)

Here's a little illustration:

In [1]: import numpy as np
In [2]: arr = np.arange(9).reshape((3,3))
In [3]: arr
Out[3]: 
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
In [4]: np.min(arr, 0)
Out[4]: array([0, 1, 2])
In [5]: np.min(arr, 1)
Out[5]: array([0, 3, 6])
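
The same idea applied to comma-separated text might look like the sketch below. The CSV content is a hypothetical stand-in for a file on disk, read through `io.StringIO`; the variable names are likewise assumptions.

```python
import io

import numpy as np

# Hypothetical CSV content standing in for a file on disk.
csv_text = "10.1,15.6,12.3\n2.1,5.3,7.0\n12.1,7.0,9.3\n"

# delimiter="," is needed because np.loadtxt splits on whitespace by default.
arr = np.loadtxt(io.StringIO(csv_text), delimiter=",")

col_min = arr.min(axis=0)  # one value per column
row_min = arr.min(axis=1)  # one value per row
print(col_min, row_min)
```

`arr.min(axis=0)` is equivalent to `np.min(arr, axis=0)`; axis 0 collapses the rows, leaving one minimum per column.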

The following should work:

import csv

with open("data.csv", "r") as f_input:
    lmin_col = []
    lmin_row = []

    for row in csv.reader(f_input):
        row = map(float, row)
        lmin_row.append(min(row))

        if lmin_col:
            lmin_col = map(min, lmin_col, row)
        else:
            lmin_col = row

    print "Min per row:", lmin_row
    print "Min per col:", lmin_col

With the following as input:

10.1, 15.6, 12.3, 13.2, 17.0
2.1,  5.3,  7.0,  11.4, 5.5
12.1, 7.0,  9.3,  28.7, 1.0

It gives the following output:

Min per row: [10.1, 2.1, 1.0]
Min per col: [2.1, 5.3, 7.0, 11.4, 1.0]

Tested using Python 2.7. Below is a possible alternative version for Python 3:

import csv

with open("data.csv", "r") as f_input:
    lmin_col = []
    lmin_row = []

    for row in csv.reader(f_input):
        row = [float(col) for col in row]
        lmin_row.append(min(row))

        if lmin_col:
            lmin_col = [min(x,y) for x,y in zip(lmin_col, row)]
        else:
            lmin_col = row

    print("Min per row:", lmin_row)
    print("Min per col:", lmin_col)
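
For checking the logic without a file on disk, the running-minimum approach can be wrapped in a function that accepts any file-like object. This is only a sketch: the function name `row_and_column_minima` and the use of `io.StringIO` and `skipinitialspace=True` are assumptions, not part of the answer above. The sample input shown above is reused as the test data.

```python
import csv
import io


def row_and_column_minima(f_input):
    """Return (row minima, column minima) for a CSV file-like object."""
    lmin_row = []
    lmin_col = []
    # skipinitialspace=True drops the blanks that follow each comma.
    for row in csv.reader(f_input, skipinitialspace=True):
        values = [float(col) for col in row]
        lmin_row.append(min(values))
        if lmin_col:
            # Keep the smaller of the running minimum and the new value.
            lmin_col = [min(x, y) for x, y in zip(lmin_col, values)]
        else:
            lmin_col = values
    return lmin_row, lmin_col


# The sample input from above, held in memory instead of data.csv.
sample = io.StringIO(
    "10.1, 15.6, 12.3, 13.2, 17.0\n"
    "2.1,  5.3,  7.0,  11.4, 5.5\n"
    "12.1, 7.0,  9.3,  28.7, 1.0\n"
)
rows_min, cols_min = row_and_column_minima(sample)
print("Min per row:", rows_min)
print("Min per col:", cols_min)
```

Because only the running column minima are kept, this reads the file in a single pass without storing every row.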
