Unable to plot two columns from a DataFrame after using pandas.read_csv

Question

I'm trying to plot two columns that have been read in using pandas.read_csv. The code:

from pandas import read_csv
from matplotlib import pyplot

data = read_csv('Stats.csv', sep=',')
#data = data.astype(float)
data.plot(x = 1, y = 2)

pyplot.show()

The CSV file snippet:

1,a4,2000,125,1.9,2.8,25.6
2,a4,7000,125,1.7,2.3,18
3,a2,7000,30,0.84,1.1,8.11
4,a2,5000,30,0.83,1.05,6.87
5,a2,4000,45,2.8,3.48,16.54

With x = 1 and y = 2, it plots the second column against the fourth, not the third as I expected.

When I try to plot the third column against the fourth (x = 2, y = 3), it plots the third against the fifth.

I'm trying to plot the third against the fourth right now. When both x and y are 2, it plots the third column against the fourth, but the values are incorrect. What am I missing? Is read_csv changing the order of the columns?

Answer 1

Your input CSV has no header row, which doesn't help clarity (see Murali's comment): by default read_csv treats the first row as column names rather than data. But I think the problem stems from the nature of the column that contains a4 and a2.

This column can be used for the x axis but not for the y axis (non-numeric data on the x axis appears to be simply plotted in order). Hence the count offset: y "skips over" the column at index 1 (all indices are 0-based), but x does not.
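To check which columns are actually numeric, something like the following may help. It is only a sketch: the CSV snippet from the question is inlined so it runs on its own, and header=None is an assumption that the real file has no header row either.

```python
import io

import pandas as pd

# The snippet from the question, inlined so the example is self-contained.
CSV_TEXT = """1,a4,2000,125,1.9,2.8,25.6
2,a4,7000,125,1.7,2.3,18
3,a2,7000,30,0.84,1.1,8.11
4,a2,5000,30,0.83,1.05,6.87
5,a2,4000,45,2.8,3.48,16.54
"""

# Assumption: the file has no header row, so header=None keeps the first
# row as data and labels the columns 0..6.
data = pd.read_csv(io.StringIO(CSV_TEXT), header=None)

# The column holding a4/a2 shows up with a non-numeric dtype.
print(data.dtypes)

# Labels of the columns that can be plotted on a numeric y axis.
numeric_columns = data.select_dtypes(include="number").columns.tolist()
print(numeric_columns)
```

The column at index 1 is the only one missing from numeric_columns, which matches the a4/a2 column described above.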

Running

data.plot(x=1,y=0)

and

data.plot(x=0,y=1)

and inspecting the axes helps visualise what's going on.

Bizarrely, this means you can do

data.plot(x=1,y=1)

to get what you want.
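A less surprising route might be to avoid positional x and y altogether and select the columns by label. The sketch below assumes the file has no header row; the names c0 to c6 are placeholders made up for illustration, the CSV is inlined, and the Agg backend is chosen only so the example runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, only so the sketch runs headless

import pandas as pd

# The snippet from the question, inlined so the example is self-contained.
CSV_TEXT = """1,a4,2000,125,1.9,2.8,25.6
2,a4,7000,125,1.7,2.3,18
3,a2,7000,30,0.84,1.1,8.11
4,a2,5000,30,0.83,1.05,6.87
5,a2,4000,45,2.8,3.48,16.54
"""

# Hypothetical column names (not from the original file): c0 .. c6.
column_names = [f"c{i}" for i in range(7)]

# header=None keeps the first row as data instead of using it as a header.
data = pd.read_csv(io.StringIO(CSV_TEXT), header=None, names=column_names)

# Third column on x, fourth column on y, chosen by label rather than position.
ax = data.plot(x="c2", y="c3")

# Save instead of pyplot.show() because the backend is non-interactive.
ax.figure.savefig("third_vs_fourth.png")
```

With labels there is no index arithmetic to get wrong, and the first row of the file stays in the plotted data.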

solution1
0 ACCEPTED 2016-08-31 12:03:31
