简体   繁体   English

Pandas 多索引 dataframe 累计和

[英]Pandas multi-index dataframe cumulative sum

I have a multi-index dataframe below:我在下面有一个多索引 dataframe:

vessels_df.head(10)

                             eta_date
vessel           load_dates  
AM OCEAN SILVER  2020-06-05  2020-06-04
                 2020-06-06 
                 2020-06-07 
                 2020-06-08 
                 2020-06-09 
                 2020-06-10 
                 2020-06-11 
APJ ANGAD 
                 2020-06-09  2020-06-08
                 2020-06-10 
                 2020-06-11 
AQUATONKA       
                 2020-06-03  2020-06-02
                 2020-06-04 
                 2020-06-05 
                 2020-06-06 
                 2020-06-07 
                 2020-06-08 
                 2020-06-09 
                 2020-06-10 
                 2020-06-11 

and a dictionary with a list of daily charges that are incurred for every day beyond the FIRST day of the load_dates和一本字典,其中列出了在load_dates的第一天之后的每一天产生的每日费用

demurrage_charges_dict = {
    'AM OCEAN SILVER': 11076,
    'APJ ANGAD': 21771,
    'AQUATONKA': 14312
}

desired output所需 output

I would like to create a column that is the cumulative sum of the daily charge over the period in the index, for example:我想创建一个列,该列是索引中该期间每日费用的累积总和,例如:

                            eta_date      demurrage_charges
vessel           load_dates  
AM OCEAN SILVER  2020-06-05  2020-06-04   0
                 2020-06-06               11,076
                 2020-06-07               22,152
                 2020-06-08               33,228
                 2020-06-09               44,304
                 2020-06-10               55,380
                 2020-06-11               66,456

I believe I could reset the index of the 'vessels_df' , convert the demurrage_charges_dict to df and merge the two then use pd.cumsum() , but wondered if there is a more elegant way to perform this?我相信我可以重置'vessels_df'的索引,将demurrage_charges_dict转换为 df 并将两者合并,然后使用pd.cumsum() ,但想知道是否有更优雅的方式来执行此操作?

Much appreciated.非常感激。

cumcount the vessel index level and multiply that by the mapping of the vessel with the dict cumcount容器索引级别并将其乘以容器与 dict 的映射

idx = df.index.get_level_values('vessel')
df['demurrage_charges'] = (df.groupby(idx).cumcount()
                           * idx.map(demurrage_charges_dict))

                              eta_date  demurrage_charges
vessel          load_dates                               
AM OCEAN SILVER 2020-06-05  2020-06-04                  0
                2020-06-06        None              11076
                2020-06-07        None              22152
                2020-06-08        None              33228
                2020-06-09        None              44304
                2020-06-10        None              55380
                2020-06-11        None              66456
APJ ANGAD       2020-06-09  2020-06-08                  0
                2020-06-10        None              21771
                2020-06-11        None              43542
AQUATONKA       2020-06-03  2020-06-02                  0
                2020-06-04        None              14312
                2020-06-05        None              28624
                2020-06-06        None              42936
                2020-06-07        None              57248
                2020-06-08        None              71560
                2020-06-09        None              85872
                2020-06-10        None             100184
                2020-06-11        None             114496

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM