Plot year over year on a 12 month axis

Question

I want to plot 6 years of 12 month period data on one 12 month axis from Dec - Jan.

import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

df = pd.Series(np.random.randn(72), index=pd.date_range('1/1/2000', periods=72, freq='M'))

# display(df.head())
2000-01-31    0.713724
2000-02-29    0.416233
2000-03-31   -0.147765
2000-04-30    0.141021
2000-05-31    0.966261
Freq: M, dtype: float64

grouped = df.groupby(df.index.map(lambda x: x.year))

grouped.plot()


I'm getting breaks in the lines between each year. However, what I want is to have the years stacked over each other. Are there any simple and clean ways to do it?

Answer 1

There's probably a better way than this:

In [44]: vals = df.groupby(lambda x: (x.year, x.month)).sum()

In [45]: vals
Out[45]: 
(2000, 1)    -0.235044
(2000, 2)    -1.196815
(2000, 3)    -0.370850
(2000, 4)     0.719915
(2000, 5)    -1.228286
(2000, 6)    -0.192108
(2000, 7)    -0.337032
(2000, 8)    -0.174219
(2000, 9)     0.605742
(2000, 10)    1.061558
(2000, 11)   -0.683674
(2000, 12)   -0.813779
(2001, 1)     2.103178
(2001, 2)    -1.099845
(2001, 3)     0.366811
...
(2004, 10)   -0.905740
(2004, 11)   -0.143628
(2004, 12)    2.166758
(2005, 1)     0.944993
(2005, 2)    -0.741785
(2005, 3)     1.531754
(2005, 4)    -1.106024
(2005, 5)    -1.925078
(2005, 6)     0.400930
(2005, 7)     0.321962
(2005, 8)    -0.851656
(2005, 9)     0.371305
(2005, 10)   -0.868836
(2005, 11)   -0.932977
(2005, 12)   -0.530207
Length: 72, dtype: float64

Now change the index on vals to a MultiIndex:

In [46]: vals.index = pd.MultiIndex.from_tuples(vals.index)

In [47]: vals.head()
Out[47]: 
2000  1   -0.235044
      2   -1.196815
      3   -0.370850
      4    0.719915
      5   -1.228286
dtype: float64

Then unstack and plot:

In [48]: vals.unstack(0).plot()
Out[48]: <matplotlib.axes.AxesSubplot at 0x1171a2dd0>

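The separate MultiIndex step can probably be skipped by grouping on two arrays at once, since groupby then builds the (year, month) MultiIndex itself. A minimal sketch, assuming a monthly Series like the one in the question; the names `s` and `wide`, the fixed seed, and the month-start frequency are made up for this illustration:

```python
import numpy as np
import pandas as pd

# Illustrative data only: 72 month-start timestamps covering 2000-2005
rng = np.random.default_rng(0)
s = pd.Series(rng.standard_normal(72),
              index=pd.date_range("2000-01-01", periods=72, freq="MS"))

# Grouping by two arrays yields a (year, month) MultiIndex directly
vals = s.groupby([s.index.year, s.index.month]).sum()

# Move the year level (level 0) into the columns: one column per year
wide = vals.unstack(0)

# wide.plot() would then draw one line per year on a shared 1-12 month axis
```

The result has the months 1 to 12 as its index and one column per year, which is the same shape the unstack step above produces.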

Answer 2

I think it is clearer, and easier to transform, if the data is a pandas.DataFrame, not a pandas.Series.
- The sample data in the OP is a pandas.Series, but a pandas.DataFrame is likely to be more typical for people looking to solve this problem, so we'll begin by using .to_frame().
Extract the month and year components of the datetime index.
- This index is already a datetime dtype; if your data is not, use pd.to_datetime() to convert the date index / column.
- If the dates are in a column, and not the index, use the .dt accessor to get the month and year (e.g. df[col].dt.year); for a datetime index, use the attributes directly (e.g. df.index.year).
Use pandas.pivot_table to transform the dataframe from a long to a wide format, and aggregate the data (e.g. 'sum', 'mean', etc.).
- This puts the dataframe into the correct shape to plot easily, without unstacking and further manipulation.
- The index will always be the x-axis, and each column will be plotted as a separate series.
- If there is no repeated data for a given 'month', so no aggregation is required, use pandas.DataFrame.pivot instead.
Plot the pivoted dataframe with pandas.DataFrame.plot.

Tested in python 3.11, pandas 1.5.2, matplotlib 3.6.2

import pandas as pd

# for this OP convert the Series to a DataFrame
df = df.to_frame()

# extract month and year from the index and create columns
df['month'] = df.index.month
df['year'] = df.index.year

# display(df.head(3))
                   0  month  year
2000-01-31  0.167921      1  2000
2000-02-29  0.523505      2  2000
2000-03-31  0.817376      3  2000

# transform the dataframe to a wide format
dfp = pd.pivot_table(data=df, index='month', columns='year', values=0, aggfunc='sum')

# display(dfp.head(3))
year       2000      2001      2002      2003      2004      2005
month                                                            
1      0.167921  0.637999 -0.174122  0.620622 -0.854315 -1.523579
2      0.523505 -0.344658 -0.280819  0.845543  0.782439 -0.593732
3      0.817376 -0.004282 -0.907424  0.352655  1.258275 -0.624112

# plot; use xticks=dfp.index so every month number is displayed
ax = dfp.plot(ylabel='Aggregated Sum', figsize=(6, 4), xticks=dfp.index)
# reposition the legend
ax.legend(bbox_to_anchor=(1, 1.02), loc='upper left')
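The notes above also mention two cases the example does not cover: dates stored in a column rather than the index, and data with no repeated months, where no aggregation is needed. A minimal sketch of both, assuming hypothetical column names `date` and `value` and made-up data (none of these come from the original post):

```python
import numpy as np
import pandas as pd

# Hypothetical long-format data: dates stored as strings in a column
rng = np.random.default_rng(0)
dates = pd.date_range("2000-01-01", periods=72, freq="MS")
long_df = pd.DataFrame({"date": dates.strftime("%Y-%m-%d"),
                        "value": rng.standard_normal(72)})

# Convert the strings to datetime, then use the .dt accessor on the column
long_df["date"] = pd.to_datetime(long_df["date"])
long_df["month"] = long_df["date"].dt.month
long_df["year"] = long_df["date"].dt.year

# Each (month, year) pair occurs exactly once, so pivot (no aggregation) is enough
wide_df = long_df.pivot(index="month", columns="year", values="value")
```

If a (month, year) pair appeared more than once, pivot would raise an error about duplicate entries, which is the situation where pivot_table with an aggfunc is the appropriate choice.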

To get month names on the axis, create the 'month' column with:
- df['month'] = df.index.strftime('%b'), which gets the month abbreviation

from calendar import month_abbr  # month abbreviations in calendar order; index 0 is an empty string

# for this OP convert the Series to a DataFrame
df = df.to_frame()

# extract the month abbreviation
df['month'] = df.index.strftime('%b')
df['year'] = df.index.year

# transform
dfp = pd.pivot_table(data=df, index='month', columns='year', values=0, aggfunc='sum')

# reorder the dfp index (pivot_table sorts it alphabetically) so the x-axis will be in calendar order
dfp = dfp.loc[month_abbr[1:]]

# display(dfp.head(3))
year       2000      2001      2002      2003      2004      2005
month                                                            
Jan    0.167921  0.637999 -0.174122  0.620622 -0.854315 -1.523579
Feb    0.523505 -0.344658 -0.280819  0.845543  0.782439 -0.593732
Mar    0.817376 -0.004282 -0.907424  0.352655  1.258275 -0.624112

# plot; using xticks=range(12) will result in all the xticks being labeled with a month, otherwise not all ticks will be displayed
ax = dfp.plot(ylabel='Aggregated Sum', figsize=(6, 4), xticks=range(12))
ax.legend(bbox_to_anchor=(1, 1.02), loc='upper left')
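An alternative that avoids reordering the index is to keep the numeric month index and only relabel the ticks. A rough sketch, assuming a `dfp` shaped like the numeric-month pivot table shown earlier; it is rebuilt here from made-up data so the block runs on its own, and the Agg backend is selected only so that it runs without a display:

```python
from calendar import month_abbr

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, not needed in a notebook
import numpy as np
import pandas as pd

# Made-up stand-in for dfp: months 1-12 as the index, years as columns
rng = np.random.default_rng(0)
dfp = pd.DataFrame(rng.standard_normal((12, 6)),
                   index=pd.RangeIndex(1, 13, name="month"),
                   columns=pd.Index(range(2000, 2006), name="year"))

ax = dfp.plot(ylabel="Aggregated Sum", figsize=(6, 4))

# Put a tick at every month number, then label the ticks with abbreviations
ax.set_xticks(list(dfp.index))
ax.set_xticklabels(month_abbr[1:])
ax.legend(bbox_to_anchor=(1, 1.02), loc="upper left")
```

Because the index stays numeric, the months are already in calendar order and no .loc reordering is required; only the tick text changes.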

This data is discrete, because it's aggregated, so it really should be plotted as a bar plot.

ax = dfp.plot(kind='bar', ylabel='Aggregated Sum', figsize=(12, 4), rot=0)
ax.legend(bbox_to_anchor=(1, 1.02), loc='upper left')

Answer 1
Score 4, accepted, 2014-02-15 16:18:47

Answer 2
Score 2, 2021-09-13 18:45:37
