Pandas multi-index dataframe cumulative sum

Question

I have the multi-index dataframe below:

vessels_df.head(10)

                             eta_date
vessel           load_dates  
AM OCEAN SILVER  2020-06-05  2020-06-04
                 2020-06-06 
                 2020-06-07 
                 2020-06-08 
                 2020-06-09 
                 2020-06-10 
                 2020-06-11 
APJ ANGAD        2020-06-09  2020-06-08
                 2020-06-10 
                 2020-06-11 
AQUATONKA        2020-06-03  2020-06-02
                 2020-06-04 
                 2020-06-05 
                 2020-06-06 
                 2020-06-07 
                 2020-06-08 
                 2020-06-09 
                 2020-06-10 
                 2020-06-11

and a dictionary, keyed by vessel, of the daily charge incurred for every day beyond the FIRST day of the load_dates:

demurrage_charges_dict = {
    'AM OCEAN SILVER': 11076,
    'APJ ANGAD': 21771,
    'AQUATONKA': 14312
}

Desired output

I would like to create a column that is the cumulative sum of the daily charge over the period in the index, for example:

                            eta_date      demurrage_charges
vessel           load_dates  
AM OCEAN SILVER  2020-06-05  2020-06-04   0
                 2020-06-06               11,076
                 2020-06-07               22,152
                 2020-06-08               33,228
                 2020-06-09               44,304
                 2020-06-10               55,380
                 2020-06-11               66,456

I believe I could reset the index of vessels_df, convert demurrage_charges_dict to a dataframe, merge the two, and then use cumsum(), but I wondered whether there is a more elegant way to do this?

Much appreciated.
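
For reference, the reset-index, merge, and cumsum route described above could be sketched roughly as follows. This is only an illustration under assumptions: the frame is a shortened reconstruction of the sample (two vessels, three load dates each), and the names flat, rates, merged, is_first_day and result are hypothetical helpers, not part of the original post.

```python
import pandas as pd

demurrage_charges_dict = {"AM OCEAN SILVER": 11076, "APJ ANGAD": 21771}

# Assumed, shortened reconstruction of the question's frame.
rows = [("AM OCEAN SILVER", d) for d in pd.date_range("2020-06-05", periods=3)]
rows += [("APJ ANGAD", d) for d in pd.date_range("2020-06-09", periods=3)]
index = pd.MultiIndex.from_tuples(rows, names=["vessel", "load_dates"])
vessels_df = pd.DataFrame({"eta_date": [None] * len(index)}, index=index)

# 1. Flatten the index and turn the dict into a two-column lookup frame.
flat = vessels_df.reset_index()
rates = pd.DataFrame(
    list(demurrage_charges_dict.items()), columns=["vessel", "daily_charge"]
)

# 2. Attach each vessel's daily charge to its rows (a left merge keeps row order).
merged = flat.merge(rates, on="vessel", how="left")

# 3. The first load date of each vessel is free, so zero it before summing.
is_first_day = merged.groupby("vessel").cumcount() == 0
merged["demurrage_charges"] = (
    merged["daily_charge"].mask(is_first_day, 0).groupby(merged["vessel"]).cumsum()
)

# 4. Restore the original MultiIndex.
result = merged.drop(columns="daily_charge").set_index(["vessel", "load_dates"])
```

Zeroing the first row before the cumulative sum is what makes the first load date cost nothing; without that step every total would be one daily charge too high.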

Answer 1

Use cumcount on the vessel index level and multiply the result by each vessel's daily charge, mapped from the dict:

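# Vessel name for every row, taken from the first index level
idx = vessels_df.index.get_level_values('vessel')
# cumcount gives 0, 1, 2, ... within each vessel; multiply by that vessel's daily charge
vessels_df['demurrage_charges'] = (vessels_df.groupby(idx).cumcount()
                                   * idx.map(demurrage_charges_dict))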

                              eta_date  demurrage_charges
vessel          load_dates                               
AM OCEAN SILVER 2020-06-05  2020-06-04                  0
                2020-06-06        None              11076
                2020-06-07        None              22152
                2020-06-08        None              33228
                2020-06-09        None              44304
                2020-06-10        None              55380
                2020-06-11        None              66456
APJ ANGAD       2020-06-09  2020-06-08                  0
                2020-06-10        None              21771
                2020-06-11        None              43542
AQUATONKA       2020-06-03  2020-06-02                  0
                2020-06-04        None              14312
                2020-06-05        None              28624
                2020-06-06        None              42936
                2020-06-07        None              57248
                2020-06-08        None              71560
                2020-06-09        None              85872
                2020-06-10        None             100184
                2020-06-11        None             114496
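
To try this end to end, a self-contained sketch is given below. The frame is an assumed reconstruction of the question's sample (eta_date is left empty because it does not affect the charges), and the names spans, rate, by_position and by_calendar are hypothetical. It also shows a variant that multiplies the rate by calendar days elapsed since each vessel's first load date, which is not from the original answer.

```python
import pandas as pd

demurrage_charges_dict = {
    "AM OCEAN SILVER": 11076,
    "APJ ANGAD": 21771,
    "AQUATONKA": 14312,
}

# Assumed reconstruction: (vessel, first load date, number of load dates).
spans = [
    ("AM OCEAN SILVER", "2020-06-05", 7),
    ("APJ ANGAD", "2020-06-09", 3),
    ("AQUATONKA", "2020-06-03", 9),
]
rows = [(v, d) for v, start, n in spans for d in pd.date_range(start, periods=n)]
index = pd.MultiIndex.from_tuples(rows, names=["vessel", "load_dates"])
df = pd.DataFrame({"eta_date": [None] * len(index)}, index=index)

vessel = df.index.get_level_values("vessel")
# Daily charge for every row, as a plain array aligned by position.
rate = vessel.map(demurrage_charges_dict).to_numpy()

# The answer's approach: row position within each vessel times its daily rate.
by_position = df.groupby(vessel).cumcount() * rate

# Variant: calendar days since each vessel's first load date times its rate.
load_date = df.index.get_level_values("load_dates").to_series(index=df.index)
first_date = load_date.groupby(level="vessel").transform("min")
by_calendar = (load_date - first_date).dt.days * rate

df["demurrage_charges"] = by_position
```

The two results agree here because every vessel has exactly one row per consecutive day. If some load dates were missing, cumcount would count rows rather than days, so which variant is appropriate depends on how the charges are meant to accrue.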

Answer 1: 3, accepted, 2020-06-19 17:07:17
