Pandas DataFrame add new column values based on group by multiple conditions
I have a DataFrame as below:
Color Month Quantity
index
0 1 1 34047
1 1 2 36654
2 2 3 37291
3 2 4 35270
4 3 5 35407
5 1 12 9300
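To reproduce the example, the sample data can be rebuilt roughly as follows. This is a minimal sketch: the values are copied from the table above, and the index name `index` is assumed from its header row.

```python
import pandas as pd

# Sample data copied from the table above
df = pd.DataFrame(
    {
        "Color": [1, 1, 2, 2, 3, 1],
        "Month": [1, 2, 3, 4, 5, 12],
        "Quantity": [34047, 36654, 37291, 35270, 35407, 9300],
    }
)
df.index.name = "index"  # assumed from the table header
print(df)
```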
I want to add an extra column, PreviousMonthQty, to this DataFrame. For each row, it should hold the Quantity of the row that has the same Color and whose Month is the previous month.
The target DataFrame I expect looks like this:
Some explanation of the logic can be seen as:
Any help would be very much appreciated.
Thank you very much.
Here is a way using MultiIndex and map after finding the previous month:
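import pandas as pd

# Parse Month as a date (midnight on the 1st) and step back one minute (unit='m' is minutes);
# the month of the result is the previous month, with January wrapping around to 12.
prev_month = pd.to_datetime(df['Month'], format='%m').sub(pd.Timedelta(1, unit='m')).dt.month
# Lookup Series: Quantity indexed by (Color, Month)
m = df.set_index(['Color', 'Month'])['Quantity']
final = df.assign(
    Prev_Month_Value=pd.MultiIndex.from_arrays([df['Color'], prev_month]).map(m).fillna(0)
)
# To assign into the existing df, use the code below instead of df.assign(), which returns a copy:
# df['Previous Month Value'] = (pd.MultiIndex.from_arrays([df['Color'], prev_month])
#                               .map(m).fillna(0))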
Output:
Color Month Quantity Prev_Month_Value
index
0 1 1 34047 9300.0
1 1 2 36654 34047.0
2 2 3 37291 0.0
3 2 4 35270 37291.0
4 3 5 35407 0.0
5 1 12 9300 0.0
Details:
Step 1: Find the previous month by converting the Month column to datetime, subtracting a small pd.Timedelta (the parsed dates fall at midnight on the 1st, so stepping back one minute lands in the previous month), and reading .dt.month.
Step 2: Create a MultiIndex Series with Quantity as the values and Color and Month as the index.
Step 3: Create a MultiIndex from Color and the prev_month Series and map it back as a new column (also fill NaN with 0).
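The three steps can also be sketched end to end as below. This is a variation under stated assumptions, not the answer's exact code: the previous month is computed with modular arithmetic instead of a datetime conversion, the lookup uses Series.reindex instead of MultiIndex.map, and the frame is rebuilt from the question's table. It also assumes each (Color, Month) pair occurs at most once.

```python
import pandas as pd

# Sample frame rebuilt from the question's table
df = pd.DataFrame({
    "Color": [1, 1, 2, 2, 3, 1],
    "Month": [1, 2, 3, 4, 5, 12],
    "Quantity": [34047, 36654, 37291, 35270, 35407, 9300],
})

# Step 1: previous month, with January (1) wrapping around to December (12)
prev_month = (df["Month"] - 2) % 12 + 1

# Step 2: lookup Series of Quantity indexed by (Color, Month)
m = df.set_index(["Color", "Month"])["Quantity"]

# Step 3: look up each (Color, previous month) pair; missing pairs become NaN, then 0
keys = pd.MultiIndex.from_arrays([df["Color"], prev_month])
df["Prev_Month_Value"] = m.reindex(keys).fillna(0).to_numpy()
print(df)
```

The modular form avoids parsing dates at all, which may be easier to read when Month is already an integer from 1 to 12.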
Here is another approach using merge: we'll merge on a prv_month key, which we assign inline:
# prv_month is Month - 1, with 0 (the month before January) replaced by 12
df['PreviousQty'] = (df.assign(prv_month=df['Month'].sub(1).where(lambda x: x != 0, 12))
                       .merge(df,
                              how='left',
                              left_on=['Color', 'prv_month'],
                              right_on=['Color', 'Month'])['Qty_y']
                       .fillna(0))
[out]
Color Month Qty PreviousQty
0 1 1 34047 9300.0
1 1 2 36654 34047.0
2 2 3 37291 0.0
3 2 4 35270 37291.0
4 3 5 35407 0.0
5 1 12 9300 0.0
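Another way to look at the same join, sketched under assumptions: shift a copy of the table forward by one month, so that each row becomes the "previous month" record for the month after it, then left-merge on the ordinary keys. The names `lookup` and `result` are hypothetical, and the frame is rebuilt from the question's data using the Qty column name from this answer.

```python
import pandas as pd

# Sample frame rebuilt from the question's data, with this answer's column name
df = pd.DataFrame({
    "Color": [1, 1, 2, 2, 3, 1],
    "Month": [1, 2, 3, 4, 5, 12],
    "Qty": [34047, 36654, 37291, 35270, 35407, 9300],
})

# Move each row to the following month (12 wraps to 1) and rename its value column
lookup = (df.assign(Month=df["Month"] % 12 + 1)
            .rename(columns={"Qty": "PreviousQty"}))

# A left merge keeps every original row; unmatched rows get NaN, which is filled with 0
result = df.merge(lookup, how="left", on=["Color", "Month"]).fillna({"PreviousQty": 0})
print(result)
```

Because the key columns have the same names on both sides, no `_x`/`_y` suffixes appear in the result.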
Use DataFrame.pivot to reshape the DataFrame, then add the missing months with DataFrame.reindex:
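# pandas 2.0+ requires keyword arguments here; the quantity column is named 'Oty' in this answer's data
df1 = df.pivot(index='Color', columns='Month', values='Oty').reindex(columns=range(1, 13))
print(df1)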
Month 1 2 3 4 5 6 7 8 9 10 11 \
Color
1 34047.0 36654.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN 37291.0 35270.0 NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN 35407.0 NaN NaN NaN NaN NaN NaN
Month 12
Color
1 9300.0
2 NaN
3 NaN
Then use numpy.roll with DataFrame.join:
import numpy as np

# Shift every row one column to the right; December wraps around to January
s = pd.DataFrame(np.roll(df1.to_numpy(), 1, axis=1),
                 index=df1.index,
                 columns=df1.columns).stack().rename('Previous Month')
df = df.join(s, on=['Color', 'Month']).fillna({'Previous Month': 0})
print(df)
Index Color Month Oty Previous Month
0 0 1 1 34047 9300.0
1 1 1 2 36654 34047.0
2 2 2 3 37291 0.0
3 3 2 4 35270 37291.0
4 4 3 5 35407 0.0
5 5 1 12 9300 0.0
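The wrap-around is what numpy.roll contributes here: shifting by 1 along axis=1 moves each month's value one column to the right and carries the last column (December) to the front (January). A minimal sketch on a toy array, assuming four columns instead of twelve:

```python
import numpy as np

# One row, four "months"; NaN marks a month with no data
row = np.array([[10.0, 20.0, np.nan, 40.0]])

# Shift one position to the right along the columns; the last value wraps to the front
rolled = np.roll(row, 1, axis=1)
print(rolled)
```

After the roll, the column for month k holds the value that belonged to month k - 1, which is why joining it back on (Color, Month) yields the previous month's quantity.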