
Reshaping dataframe in Pandas

Is there a quick, Pythonic way to transform this table

import numpy as np
import pandas as pd
index = pd.date_range('2000-1-1', periods=36, freq='M')  # 36 month-end dates
df = pd.DataFrame(np.random.randn(36,4), index=index, columns=list('ABCD'))


In[1]: df

Out[1]: 
                   A         B         C         D
2000-01-31         H  1.368795  0.106294  2.108814
2000-02-29 -1.713401  0.557224  0.115956 -0.851140
2000-03-31 -1.454967 -0.791855 -0.461738 -0.410948
2000-04-30  1.688731 -0.216432 -0.690103 -0.319443
2000-05-31 -1.103961  0.181510 -0.600383 -0.164744
2000-06-30  0.216871 -1.018599  0.731617 -0.721986
2000-07-31  0.621375  0.790072  0.967000  1.347533
2000-08-31  0.588970 -0.360169  0.904809  0.606771
...

into this table

                       2001                                2000            
            12 11 10 9 8 7 6 5 4 3 2 1        12 11 10 9 8 7 6 5 4 3 2 1 
A                                                                      H
B
C
D

Please excuse the missing values. I added the "H" manually. I hope it is clear what I am looking for.

For easier checking, I've created a dataframe of the same shape but with integers as values.

The core of the solution is pandas.DataFrame.transpose, but you first need to replace the date index with a two-level index built from index.year and index.month:

>>> df = pd.DataFrame(np.random.randint(10,size=(36, 4)), index=index, columns=list('ABCD'))
>>> df.set_index(keys=[df.index.year, df.index.month]).transpose()
  2000                                  2001                                  2002                                 
    1  2  3  4  5  6  7  8  9  10 11 12   1  2  3  4  5  6  7  8  9  10 11 12   1  2  3  4  5  6  7  8  9  10 11 12
A    0  0  8  7  8  0  7  1  5  1  5  4    2  1  9  5  2  0  5  3  6  4  9  3    5  1  7  3  1  7  6  5  6  8  4  1
B    4  9  9  5  2  0  8  0  9  5  2  7    5  6  3  6  8  8  8  8  0  6  3  7    5  9  6  3  9  7  1  4  7  8  3  3
C    3  2  4  3  1  9  7  6  9  6  8  6    3  5  3  2  2  1  3  1  1  2  8  2    2  6  9  6  1  5  6  5  4  6  7  5
D    8  1  3  9  2  3  8  7  3  2  1  0    1  3  9  1  8  6  4  7  4  6  3  2    9  8  9  9  0  7  4  7  3  6  5  2
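The target table in the question lists the most recent year first and the months from 12 down to 1, whereas the output above is in ascending order. The following is a minimal sketch of one way to get the descending layout, not necessarily what the answer's author had in mind. It assumes deterministic integer data (np.arange) so the result can be checked by eye, month-start dates (freq="MS") instead of the month-end dates used above, and the illustrative level names "year" and "month", none of which appear in the original.

```python
import numpy as np
import pandas as pd

# Assumption: month-start dates; the year/month values are the same as
# for the month-end dates used in the question.
index = pd.date_range("2000-01-01", periods=36, freq="MS")
df = pd.DataFrame(np.arange(36 * 4).reshape(36, 4), index=index, columns=list("ABCD"))

# Replace the date index with an explicit (year, month) MultiIndex,
# then transpose so that A..D become the rows.
wide = df.copy()
wide.index = pd.MultiIndex.from_arrays(
    [df.index.year, df.index.month], names=["year", "month"]
)
wide = wide.T

# Sort the columns in descending order: latest year first, months 12..1.
wide_desc = wide.sort_index(axis=1, ascending=False)

print(wide_desc.iloc[:, :3])
```

A single cell can then be read with a (year, month) tuple, for example wide.loc["A", (2000, 1)], which corresponds to the "H" position in the question.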

Of course, this will not work properly if you have more than one record per year and month. In that case you need to aggregate your data with groupby first:

>>> i = pd.date_range('2000-1-1', periods=36, freq='W') # weekly index
>>> df = pd.DataFrame(np.random.randint(10,size=(36, 4)), index=i, columns=list('ABCD'))
>>> df.groupby(by=[df.index.year, df.index.month]).sum().transpose()
  2000                               
     1   2   3   4   5   6   7   8  9
A   12  13  15  23   9  21  21  31  7
B   33  24  19  30  15  19  20   7  4
C   20  24  26  24  15  18  29  17  4
D   23  29  14  30  19  12  12  11  5
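To see what the aggregation does, here is a small sketch of the same groupby approach with every value set to 1, so each cell of the result is simply the number of weekly records that fell into that month. The all-ones data and the level names "year" and "month" are assumptions made for illustration. sum() is used as in the answer above, but another aggregation such as mean() may fit your data better.

```python
import numpy as np
import pandas as pd

# Weekly index; pandas anchors freq="W" on Sundays by default.
i = pd.date_range("2000-01-01", periods=36, freq="W")
df = pd.DataFrame(np.ones((36, 4), dtype=int), index=i, columns=list("ABCD"))

# Collapse all records of the same (year, month) into one row, then transpose.
monthly = df.groupby([df.index.year, df.index.month]).sum()
monthly.index.names = ["year", "month"]  # illustrative names
wide = monthly.T

print(wide)
```

Because every input value is 1, the row totals equal the number of input rows (36), which is a quick check that no record was lost or double counted during grouping.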
