简体   繁体   中英

Pandas Plotting from Pivot Table

I am basically trying to reproduce climate diagrams showing mean temperature and precipitation over the year for various locations.

I've generated a pivot table from my csv the following way:

data = pd.read_csv("05_temp_rain_v2.csv")
pivot = data.pivot_table(["rain(mm)","temp(dC)"], ["loc","month"])  

sample data in text form:

loc,lat,long,year,month,rain(mm),temp(dC)
Adria_-_Bellombra,45.011129,12.034126,1994,1,45.6,4.6  
Adria_-_Bellombra,45.011129,12.034126,1994,2,31.4,4  
Adria_-_Bellombra,45.011129,12.034126,1994,3,1.6,10.7  
Adria_-_Bellombra,45.011129,12.034126,1994,4,74.4,11.5  
Adria_-_Bellombra,45.011129,12.034126,1994,5,26,17.2  
Adria_-_Bellombra,45.011129,12.034126,1994,6,108.6,20.6

Pivot Table:

在此输入图像描述

Since I am handling various locations, I am iterating over them:

locations=pivot.index.get_level_values(0).unique()

for location in locations:
    split=pivot.xs(location)

    rain=split["rain(mm)"]
    temp=split["temp(dC)"]

    plt.subplots()
    temp.plot(kind="line",color="r",).legend()
    rain.plot(kind="bar").legend()

An example plot output is shown below:

在此输入图像描述

Why are my temperature values being plotted starting from February (2)?
I assume it is because the temperature values are listed in the second column.

What would be the proper way to handle and plot different data (two columns) from a pivot table?

It's because line and bar plots do not set the xlim the same way. The x-axis is interpreted as categorical data in case of the bar plot, whereas it is interpreted as continuous data for the line plot. The result being that xlim and xticks are not set identically in both situations.

Consider this:

In [4]: temp.plot(kind="line",color="r",)
Out[4]: <matplotlib.axes._subplots.AxesSubplot at 0x117f555d0>
In [5]: plt.xticks()
Out[5]: (array([ 1.,  2.,  3.,  4.,  5.,  6.]), <a list of 6 Text xticklabel objects>)

where the position of the ticks is an array of float ranging from 1 to 6 .

and

In [6]: rain.plot(kind="bar").legend()
Out[6]: <matplotlib.legend.Legend at 0x11c15e950>
In [7]: plt.xticks()
Out[7]: (array([0, 1, 2, 3, 4, 5]), <a list of 6 Text xticklabel objects>)

where the position of the ticks is an array of int ranging from 0 to 5 .

So, the easier is to replace this part:

temp.plot(kind="line", color="r",).legend()
rain.plot(kind="bar").legend()

by:

rain.plot(kind="bar").legend()
plt.plot(range(len(temp)), temp, "r", label=temp.name)
plt.legend()

酒吧线情节熊猫

Thanks to jeanrjc's answer and this thread I think I'm finally quite satisfied!

for location in locations:
#print(pivot.xs(location, level=0))

split=pivot.xs(location)
rain=split["rain(mm)"]
temp=split["temp(dC)"]

fig = plt.figure()
ax1 = rain.plot(kind="bar")
ax2 = ax1.twinx()
ax2.plot(ax1.get_xticks(),temp,linestyle='-',color="r")
ax2.set_ylim((-5, 50.))
#ax1.set_ylim((0, 300.))
ax1.set_ylabel('Precipitation (mm)', color='blue')
ax2.set_ylabel('Temperature (°C)', color='red')
ax1.set_xlabel('Months')
plt.title(location)
labels = ['Jan','Feb','Mar','Apr','May','Jun', 'Jul','Aug','Sep','Oct','Nov','Dez']
#plt.xticks(range(12),labels,rotation=45)
ax1.set_xticklabels(labels, rotation=45)  

I am receiving the following output, which is very close to what I intend: 样本图

You could loop over the results of a groupby operation:

for name, group in data[['loc', 'month', 'rain(mm)', 'temp(dC)']].groupby('loc'):
    group.set_index('month').plot()

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM