Pandas Plotting from Pivot Table

Question

I am basically trying to reproduce climate diagrams showing mean temperature and precipitation over the year for various locations.

I've generated a pivot table from my csv the following way:

data = pd.read_csv("05_temp_rain_v2.csv")
pivot = data.pivot_table(["rain(mm)","temp(dC)"], ["loc","month"])

sample data in text form:

loc,lat,long,year,month,rain(mm),temp(dC)
Adria_-_Bellombra,45.011129,12.034126,1994,1,45.6,4.6  
Adria_-_Bellombra,45.011129,12.034126,1994,2,31.4,4  
Adria_-_Bellombra,45.011129,12.034126,1994,3,1.6,10.7  
Adria_-_Bellombra,45.011129,12.034126,1994,4,74.4,11.5  
Adria_-_Bellombra,45.011129,12.034126,1994,5,26,17.2  
Adria_-_Bellombra,45.011129,12.034126,1994,6,108.6,20.6

Pivot Table:

Since I am handling various locations, I am iterating over them:

locations=pivot.index.get_level_values(0).unique()

for location in locations:
    split=pivot.xs(location)

    rain=split["rain(mm)"]
    temp=split["temp(dC)"]

    plt.subplots()
    temp.plot(kind="line",color="r",).legend()
    rain.plot(kind="bar").legend()

An example plot output is shown below:

Why are my temperature values being plotted starting from February (2)?
I assume it is because the temperature values are listed in the second column.

What would be the proper way to handle and plot different data (two columns) from a pivot table?

Answer 1

It's because line and bar plots do not set the xlim the same way. The x-axis is interpreted as categorical data in case of the bar plot, whereas it is interpreted as continuous data for the line plot. The result being that xlim and xticks are not set identically in both situations.

Consider this:

In [4]: temp.plot(kind="line",color="r",)
Out[4]: <matplotlib.axes._subplots.AxesSubplot at 0x117f555d0>
In [5]: plt.xticks()
Out[5]: (array([ 1.,  2.,  3.,  4.,  5.,  6.]), <a list of 6 Text xticklabel objects>)

where the position of the ticks is an array of float ranging from 1 to 6 .

and

In [6]: rain.plot(kind="bar").legend()
Out[6]: <matplotlib.legend.Legend at 0x11c15e950>
In [7]: plt.xticks()
Out[7]: (array([0, 1, 2, 3, 4, 5]), <a list of 6 Text xticklabel objects>)

where the position of the ticks is an array of int ranging from 0 to 5 .

So, the easier is to replace this part:

temp.plot(kind="line", color="r",).legend()
rain.plot(kind="bar").legend()

by:

rain.plot(kind="bar").legend()
plt.plot(range(len(temp)), temp, "r", label=temp.name)
plt.legend()

Answer 2

Thanks to jeanrjc's answer and this thread I think I'm finally quite satisfied!

for location in locations:
#print(pivot.xs(location, level=0))

split=pivot.xs(location)
rain=split["rain(mm)"]
temp=split["temp(dC)"]

fig = plt.figure()
ax1 = rain.plot(kind="bar")
ax2 = ax1.twinx()
ax2.plot(ax1.get_xticks(),temp,linestyle='-',color="r")
ax2.set_ylim((-5, 50.))
#ax1.set_ylim((0, 300.))
ax1.set_ylabel('Precipitation (mm)', color='blue')
ax2.set_ylabel('Temperature (°C)', color='red')
ax1.set_xlabel('Months')
plt.title(location)
labels = ['Jan','Feb','Mar','Apr','May','Jun', 'Jul','Aug','Sep','Oct','Nov','Dez']
#plt.xticks(range(12),labels,rotation=45)
ax1.set_xticklabels(labels, rotation=45)

I am receiving the following output, which is very close to what I intend:

Answer 3

You could loop over the results of a groupby operation:

for name, group in data[['loc', 'month', 'rain(mm)', 'temp(dC)']].groupby('loc'):
    group.set_index('month').plot()

Pandas Plotting from Pivot Table

Question

3 answers

solution1
6 2016-03-21 16:15:02

solution2
2 2016-03-21 17:02:32

solution3
0 2016-03-21 15:53:37

Pandas Plotting from Pivot Table

Question

3 answers

solution1 6 2016-03-21 16:15:02

solution2 2 2016-03-21 17:02:32

solution3 0 2016-03-21 15:53:37

solution1
6 2016-03-21 16:15:02

solution2
2 2016-03-21 17:02:32

solution3
0 2016-03-21 15:53:37