简体   繁体   English

Pandas Pivot 时间序列按年份

[英]Pandas Pivot Time-series by year

Hello and thanks in advance for any help.您好,提前感谢您的帮助。 I have a simple dataframe with two columns.我有一个包含两列的简单数据框。 I did not set an index explicitly, but I believe a dataframe gets an integer index that I see along the left side of the output.我没有明确设置索引,但我相信数据框会获得我在输出左侧看到的整数索引。 Question below:下面的问题:

df = pandas.DataFrame(res)
df.columns = ['date', 'pb']
df['date'] = pandas.to_datetime(df['date'])
df.dtypes

date    datetime64[ns]
pb             float64
dtype: object


date    pb
0   2016-04-01  24199.933333
1   2016-03-01  23860.870968
2   2016-02-01  23862.275862
3   2016-01-01  25049.193548
4   2015-12-01  24882.419355
5   2015-11-01  24577.000000


date    datetime64[ns]
pb             float64
dtype: object

I would like to pivot the dataframe so that I have years across the top (columns): 2016, 2015, etc and a row for each month: 1 - 12.我想旋转数据框,以便顶部(列)有年份:2016、2015 等,每个月有一行:1-12。

Using the .dt accessor you can create columns for year and month and then pivot on those:使用.dt accessor ,您可以为年和月创建列,然后以这些为中心:

df['Year'] = df['date'].dt.year
df['Month'] = df['date'].dt.month
pd.pivot_table(df,index='Month',columns='Year',values='pb',aggfunc=np.sum)

Alternately if you don't want those other columns you can do:或者,如果您不想要那些其他列,您可以这样做:

pd.pivot_table(df,index=df['date'].dt.month,columns=df['date'].dt.year,
               values='pb',aggfunc=np.sum)

With my dummy dataset that produces:使用我的虚拟数据集产生:

Year             2013      2014      2015      2016
date
1             92924.0  102072.0  134660.0  132464.0
2             79935.0   82145.0  118234.0  147523.0
3             86878.0   94959.0  130520.0  138325.0
4             80267.0   89394.0  120739.0  129002.0
5             79283.0   91205.0  118904.0  125878.0
6             77828.0   89884.0  112488.0  121953.0
7             78839.0   94407.0  113124.0       NaN
8             79885.0   97513.0  116771.0       NaN
9             79455.0   99555.0  114833.0       NaN
10            77616.0   98764.0  115872.0       NaN
11            75043.0   95756.0  107123.0       NaN
12            81996.0  102637.0  114952.0       NaN

Using stack instead of pivot使用stack而不是pivot

df = pd.DataFrame(
         dict(date=pd.date_range('2013-01-01', periods=42, freq='M'),
              pb=np.random.rand(42)))

df.set_index([df.date.dt.month, df.date.dt.year]).pb.unstack()

在此处输入图像描述

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM