Pandas DataFrame: add a new column with values based on grouping by multiple conditions

Question

I have a DataFrame as below:

       Color  Month  Quantity
index                        
0          1      1     34047
1          1      2     36654
2          2      3     37291
3          2      4     35270
4          3      5     35407
5          1     12      9300

I want to add an extra column, PreviousMonthQty, to this DataFrame. For each row it should hold the Quantity of the row that has the same Color and whose Month is the previous month, so the lookup key is (Color, previous Month).

The target DataFrame I expect looks like this:

Some explanation of the logic:

Any help would be very much appreciated.

Thank you very much.

Answer 1

Here is a way using MultiIndex and map after finding the previous month:

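import pandas as pd

# unit='m' means one minute: each parsed Month is the first day of that month,
# so stepping back one minute lands in the previous month (January wraps to December)
prev_month = pd.to_datetime(df['Month'], format='%m').sub(pd.Timedelta(1, unit='m')).dt.month

# lookup Series of Quantity indexed by (Color, Month)
m = df.set_index(['Color', 'Month'])['Quantity']

final = df.assign(Prev_Month_Value=pd.MultiIndex.from_arrays([df['Color'], prev_month])
                                                 .map(m).fillna(0))

# To assign into the existing df, use the code below instead of df.assign(), which returns a copy
# df['Prev_Month_Value'] = (pd.MultiIndex.from_arrays([df['Color'], prev_month])
#                                        .map(m).fillna(0))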

Output:

       Color  Month  Quantity  Prev_Month_Value
index                                          
0          1      1     34047            9300.0
1          1      2     36654           34047.0
2          2      3     37291               0.0
3          2      4     35270           37291.0
4          3      5     35407               0.0
5          1     12      9300               0.0

Details:

Step 1: Find the previous month by converting the Month column to datetime, stepping back with pd.Timedelta, and reading the month with .dt.month. Each parsed value is the first day of its month, so even the one-minute step used in the code lands in the previous month.

Step 2: Create a Series with a MultiIndex, using Quantity as the values and Color and Month as the index.

Step 3: Create a MultiIndex from Color and the prev_month Series, map it through the Series from Step 2, and assign the result as the new column (filling NaN with 0).
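The three steps can be run end to end as in the sketch below. It rebuilds the question's DataFrame by hand (how the asker actually loads the data is not shown, so that part is an assumption), casts Month to strings before parsing, and uses pd.DateOffset(months=1) for the month step. The DateOffset is a substitution for the Timedelta call in the answer, not the answer's own code; the lookup and the mapping follow the answer.

```python
import pandas as pd

# The question's data, rebuilt by hand (assumption: the real loading code is not shown).
df = pd.DataFrame({
    "Color": [1, 1, 2, 2, 3, 1],
    "Month": [1, 2, 3, 4, 5, 12],
    "Quantity": [34047, 36654, 37291, 35270, 35407, 9300],
})

# Step 1: previous month, with January wrapping back to December.
# pd.DateOffset(months=1) is swapped in here for the answer's Timedelta call.
first_of_month = pd.to_datetime(df["Month"].astype(str), format="%m")
prev_month = (first_of_month - pd.DateOffset(months=1)).dt.month

# Step 2: lookup Series of Quantity keyed by (Color, Month).
m = df.set_index(["Color", "Month"])["Quantity"]

# Step 3: look up each row's (Color, previous month); pairs with no match become 0.
keys = pd.MultiIndex.from_arrays([df["Color"], prev_month])
final = df.assign(Prev_Month_Value=keys.map(m).fillna(0))
print(final)
```

The lookup in Step 3 needs the (Color, Month) pairs in m to be unique, which holds for the sample data.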

Answer 2

Here is another approach using merge. We merge on a prv_month key, which we assign inline (the quantity column is named Qty here):

df['PreviousQty'] = (df.assign(prv_month=df['Month'].sub(1).where(lambda x: x!=0, 12))
                     .merge(df,
                            how='left',
                            left_on=['Color', 'prv_month'],
                            right_on=['Color', 'Month'])['Qty_y'].fillna(0))

[out]

   Color  Month    Qty  PreviousQty
0      1      1  34047       9300.0
1      1      2  36654      34047.0
2      2      3  37291          0.0
3      2      4  35270      37291.0
4      3      5  35407          0.0
5      1     12   9300          0.0
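To see what the merge produces before the quantity column is picked out, the sketch below keeps the intermediate result. The DataFrame is rebuilt by hand with the Qty column name used in this answer, and the suffixes=("", "_prev") argument and the name merged are my own choices for readability; the answer itself relies on the default _x/_y suffixes.

```python
import pandas as pd

# Sample data with the column names used in this answer (rebuilt by hand).
df = pd.DataFrame({
    "Color": [1, 1, 2, 2, 3, 1],
    "Month": [1, 2, 3, 4, 5, 12],
    "Qty": [34047, 36654, 37291, 35270, 35407, 9300],
})

# Previous month as plain integer arithmetic: subtract 1, then turn 0 into 12.
left = df.assign(prv_month=df["Month"].sub(1).where(lambda x: x != 0, 12))

# Left join each row to the row holding the same Color in the previous month.
# suffixes=("", "_prev") is an illustrative choice; the default would give Qty_x / Qty_y.
merged = left.merge(
    df,
    how="left",
    left_on=["Color", "prv_month"],
    right_on=["Color", "Month"],
    suffixes=("", "_prev"),
)
print(merged)

df["PreviousQty"] = merged["Qty_prev"].fillna(0)
print(df)
```

merge returns a frame with a fresh 0..n-1 index, so the final assignment lines up row by row only when df also has the default integer index, as it does in the sample data.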

Answer 3

Use DataFrame.pivot to reshape the DataFrame, then add the missing months with DataFrame.reindex (the quantity column is named Oty in this data):

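# pivot takes keyword arguments; positional arguments are rejected by recent pandas versions
df1 = df.pivot(index='Color', columns='Month', values='Oty').reindex(columns=range(1, 13))
print(df1)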
Month        1        2        3        4        5   6   7   8   9  10  11  \
Color                                                                        
1      34047.0  36654.0      NaN      NaN      NaN NaN NaN NaN NaN NaN NaN   
2          NaN      NaN  37291.0  35270.0      NaN NaN NaN NaN NaN NaN NaN   
3          NaN      NaN      NaN      NaN  35407.0 NaN NaN NaN NaN NaN NaN   

Month      12  
Color          
1      9300.0  
2         NaN  
3         NaN

Then use numpy.roll with DataFrame.join:

s = pd.DataFrame(np.roll(df1.to_numpy(), 1, axis=1), 
                 index=df1.index, 
                 columns=df1.columns).stack().rename('Previous Month')

df = df.join(s, on=['Color','Month']).fillna({'Previous Month':0})
print (df)
   Index  Color  Month    Oty  Previous Month
0      0      1      1  34047          9300.0
1      1      1      2  36654         34047.0
2      2      2      3  37291             0.0
3      3      2      4  35270         37291.0
4      4      3      5  35407             0.0
5      5      1     12   9300             0.0
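The effect of np.roll on the pivoted table can be checked in isolation. With a shift of 1 along axis=1, every value moves one month to the right and the December column wraps around to January, which is what makes January's previous month December. A minimal sketch, with the sample data rebuilt by hand and the names wide and rolled chosen for illustration:

```python
import numpy as np
import pandas as pd

# Sample data with the column name used in this answer (rebuilt by hand).
df = pd.DataFrame({
    "Color": [1, 1, 2, 2, 3, 1],
    "Month": [1, 2, 3, 4, 5, 12],
    "Oty": [34047, 36654, 37291, 35270, 35407, 9300],
})

# One row per Color, one column per month 1..12; absent months are NaN.
wide = df.pivot(index="Color", columns="Month", values="Oty").reindex(columns=range(1, 13))

# Shift every row one column to the right; the last column wraps to the first.
rolled = pd.DataFrame(np.roll(wide.to_numpy(), 1, axis=1),
                      index=wide.index, columns=wide.columns)

# For Color 1: January now holds December's value, February holds January's.
print(rolled.loc[1, [1, 2, 3]])
```

Cells of rolled that have no matching (Color, Month) row in df are simply never looked up by the join.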

3 answers

Answer 1: score 2, accepted, 2020-01-16 08:25:58

Answer 2: score 2, 2020-01-16 08:28:58

Answer 3: score 1, 2020-01-16 08:17:15
