
Adding a column in a multi-indexed dataframe

I have a multi-indexed dataframe, where the left-most index level is the NBA player and the second level is the NBA season (e.g. 2018-19). I'd like to add a column that numbers each player's seasons. For example, in the head of the dataframe below, I'd like to add a column next to Season that lists A.J. Guyton's 2000-01 season as '1' and his 2001-02 season as '2'. The numbering would then restart for the next player, and so on throughout the dataframe.

                     Age   Tm  OBPM  BPM  DBPM
Player      Season                            
A.J. Guyton 2000-01   22  CHI -0.57 -2.8  -2.1
            2001-02   23  CHI -0.80 -3.4  -2.4
A.J. Price  2009-10   23  IND -0.75 -2.2  -1.1
            2010-11   24  IND -1.51 -3.1  -1.0
            2011-12   25  IND -0.35 -2.2  -1.4

I'm new to pandas and relatively new to Python altogether, so this is likely a simple question, but I'm not sure how to even approach it since every player's start year is different.

You can use the split/apply/combine pattern with groupby and cumcount. cumcount acts as a transform: it returns a Series with the same index as the original dataframe, in contrast to an aggregation (like mean), which returns one value per group. Note that cumcount starts at 0, so add 1 to the result if you want each player's first season numbered 1.

df['career_year'] = df.groupby(level='Player').cumcount()

With your data, this will give

                     Age   Tm  OBPM  BPM  DBPM  career_year
Player      Season                                         
A.J. Guyton 2000-01   22  CHI -0.57 -2.8  -2.1            0
            2001-02   23  CHI -0.80 -3.4  -2.4            1
A.J. Price  2009-10   23  IND -0.75 -2.2  -1.1            0
            2010-11   24  IND -1.51 -3.1  -1.0            1
            2011-12   25  IND -0.35 -2.2  -1.4            2
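
For a version that can be run end to end, here is a minimal sketch that rebuilds the five sample rows by hand (the construction code is an assumption, since the question did not include any) and adds 1 to the cumcount result so the numbering starts at 1, as the question asked. It also prints the result of an aggregation (size) for contrast, to show that it returns one row per player rather than one per season.

```python
import pandas as pd

# Rebuild the sample rows from the question; this construction is an
# assumption, the original post only showed the printed table.
index = pd.MultiIndex.from_tuples(
    [
        ("A.J. Guyton", "2000-01"),
        ("A.J. Guyton", "2001-02"),
        ("A.J. Price", "2009-10"),
        ("A.J. Price", "2010-11"),
        ("A.J. Price", "2011-12"),
    ],
    names=["Player", "Season"],
)
df = pd.DataFrame(
    {
        "Age": [22, 23, 23, 24, 25],
        "Tm": ["CHI", "CHI", "IND", "IND", "IND"],
        "OBPM": [-0.57, -0.80, -0.75, -1.51, -0.35],
        "BPM": [-2.8, -3.4, -2.2, -3.1, -2.2],
        "DBPM": [-2.1, -2.4, -1.1, -1.0, -1.4],
    },
    index=index,
)

# cumcount numbers rows in the order they appear, so sort first to make
# sure each player's seasons are in chronological order.
df = df.sort_index()

# Transform: one value per row. cumcount is zero-based, hence the + 1.
df["career_year"] = df.groupby(level="Player").cumcount() + 1
print(df["career_year"].tolist())  # [1, 2, 1, 2, 3]

# Aggregation, for contrast: one value per player.
seasons_per_player = df.groupby(level="Player").size()
print(seasons_per_player.tolist())  # [2, 3]
```

Sorting matters because cumcount only counts position within each group; if the rows for a player were out of order, the numbers would follow row order rather than season order.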

You should include code for how to generate your sample data. It makes it easier for others to help you.

dataframe['Season'] = 2

will create a new column 'Season' and populate it with 2.
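
To see what that assignment does, here is a small sketch on a hypothetical two-row frame (the frame and its values are made up for illustration). A scalar on the right-hand side is broadcast to every row, so every row gets the same value; it does not produce a per-player running count.

```python
import pandas as pd

# Hypothetical frame, used only to show how scalar assignment behaves.
dataframe = pd.DataFrame({"Age": [22, 23]}, index=["2000-01", "2001-02"])

# The scalar 2 is broadcast: every row receives the same value.
dataframe["Season"] = 2
print(dataframe["Season"].tolist())  # [2, 2]
```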
