
How can I pivot AND stack a Pandas DataFrame?

As a rather infrequent user of Pandas, I'd like to know how best to pivot one column (representing time) so that it flows horizontally, while stacking the rest based on another column or index.

Here is what I mean:

import pandas as pd

data = [
    [2018, "Alex", 172, 61], [2019, "Alex", 173, 62], [2020, "Alex", 173, 63],
    [2018, "Bill", 168, 59], [2019, "Bill", 168, 59], [2020, "Bill", 169, 60],
    [2018, "Cody", 193, 67], [2019, "Cody", 194, 69], [2020, "Cody", 194, 68],
]

df = pd.DataFrame(data, columns=["year", "name", "height", "weight"])

Which gives:

year  name  height  weight
2018  Alex  172     61
2019  Alex  173     62
2020  Alex  173     63
2018  Bill  168     59
2019  Bill  168     59
2020  Bill  169     60
2018  Cody  193     67
2019  Cody  194     69
2020  Cody  194     68

I would like to pivot this DataFrame so that year flows horizontally (ascending), while essentially stacking all other columns grouped by name, so that my DataFrame looks like this:

                2018  2019  2020
Alex
      height    172   173   173
      weight     61    62    63
Bill
      height    168   168   169
      weight     59    59    60
Cody
      height    193   194   194
      weight     67    69    68

In summary, there are three things I am trying to accomplish here:

  1. Pivot horizontally by year
  2. Ensure that year flows in ascending order, i.e. lowest to highest
  3. Group and stack the remaining columns by the name column

There are a lot of resources online about pivoting and stacking separately, but not usually together like I am trying to do.

Let's use set_index and then stack to convert height and weight into row labels, then unstack year to turn the year level into columns:

new_df = df.set_index(['year', 'name']).stack().unstack('year')

new_df:

year         2018  2019  2020
name                         
Alex height   172   173   173
     weight    61    62    63
Bill height   168   168   169
     weight    59    59    60
Cody height   193   194   194
     weight    67    69    68

Note: stack and unstack are going to sort index levels when reshaping.
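The sorting mentioned in the note can be checked with a small sketch. The shuffled row order below is made up for illustration and is not part of the original question. Sorting behaviour has varied across pandas versions, so the trailing sort_index calls are an assumed extra safeguard rather than part of the answer above.

```python
import pandas as pd

# Same records as in the question, deliberately shuffled (illustrative only)
data = [
    [2020, "Cody", 194, 68], [2018, "Alex", 172, 61], [2019, "Bill", 168, 59],
    [2019, "Alex", 173, 62], [2018, "Cody", 193, 67], [2020, "Bill", 169, 60],
    [2020, "Alex", 173, 63], [2019, "Cody", 194, 69], [2018, "Bill", 168, 59],
]
df = pd.DataFrame(data, columns=["year", "name", "height", "weight"])

# Reshape exactly as in the answer: measurements become row labels, years become columns
new_df = df.set_index(["year", "name"]).stack().unstack("year")

# Assumed safeguard: sort both axes explicitly instead of relying on implicit sorting
new_df = new_df.sort_index(axis=1).sort_index(axis=0)

print(new_df)
```

With the explicit sort in place, the years run from lowest to highest and the rows are grouped by name regardless of the input order.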

Try this:

df.set_index(['year', 'name']).stack().unstack(0)

Here unstack(0) moves index level 0, which is year, into the columns:

year         2018  2019  2020
name                         
Alex height   172   173   173
     weight    61    62    63
Bill height   168   168   169
     weight    59    59    60
Cody height   193   194   194
     weight    67    69    68

