Python - Read Columns With Numpy

Question

I have a file with, let's say, the following X, Y, Z columns:

#file.csv
X,Y,Z
1,2,3
4,2,5
15,9,1
#

I am trying to use numpy to read column X and compute its average, standard deviation, and other statistics, but I can't get numpy to read the data as columns the way I want.

import numpy as np
import math 
my_data = np.genfromtxt(filename, delimiter=',', dtype=float, names=[x,y,z])

If I do something like np.average(my_data), it averages every row instead of every column. How can I make it average X, Y, and Z and then write the results to a file?

Also, X holds long numbers like 2747477447437.959843848, and I don't want them rounded. These are IDs and should not be changed at all! How can I achieve this?

Answer 1

Pass axis=0 to calculate the average (or any other statistic) of each column. If you don't need the first column, use the usecols argument of genfromtxt to choose which columns are read.

In [1]: import numpy as np

In [2]: from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO

In [3]: f = StringIO("""X,Y,Z
   ...: 1,2,3
   ...: 4,2,5
   ...: 15,9,1""")

In [4]: arr = np.genfromtxt(f, delimiter=',', dtype=float, skip_header=1)

In [5]: arr
Out[5]: 
array([[  1.,   2.,   3.],
       [  4.,   2.,   5.],
       [ 15.,   9.,   1.]])

In [6]: np.average(arr, axis=0)
Out[6]: array([ 6.66666667,  4.33333333,  3.        ])
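To also cover the ID column and the output file from the question, something like the following might work. It is only a sketch under a few assumptions: the ID values in the sample data are made up, the IDs are read as text (a 64-bit float cannot hold all the digits of 2747477447437.959843848), and the results go to an in-memory buffer named out, where a real file name could be passed to np.savetxt instead.

```python
import io
import numpy as np

# Sample data shaped like the question's file; the ID values are invented.
csv_text = """X,Y,Z
2747477447437.959843848,2,3
2747477447438.000000001,2,5
2747477447439.123456789,9,1
"""

# Read column X as text so the IDs are stored exactly as written.
ids = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1,
                    usecols=0, dtype=str)

# Read only the numeric columns Y and Z as floats.
yz = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1,
                   usecols=(1, 2), dtype=float)

# axis=0 reduces over the rows, giving one value per column.
means = yz.mean(axis=0)
stds = yz.std(axis=0)  # population standard deviation (ddof=0)

# One row per statistic: first the means, then the standard deviations.
stats = np.vstack([means, stds])

# Write to a text buffer; a path such as "stats.csv" (hypothetical) works too.
out = io.StringIO()
np.savetxt(out, stats, delimiter=",", header="Y,Z", comments="", fmt="%.6f")
```

Reading the file twice keeps the example short. If the IDs are never needed, the first genfromtxt call can simply be dropped.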

solution1
0 2013-01-28 19:40:22
