How to subtract values in a column using groupby
I have the following DataFrame:
ID Days TreatmentGiven TreatmentNumber
--- ---- -------------- ---------------
1 0 False NaN
1 30 False NaN
1 40 True 1
1 56 False NaN
2 0 False NaN
2 14 True 1
2 28 True 2
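I want to create a new column that re-baselines Days within each ID, relative to the day of the first treatment (TreatmentNumber == 1), so that the result looks like this: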
ID Days TreatmentGiven TreatmentNumber New_Baseline
--- ---- -------------- --------------- ------------
1 0 False NaN -40
1 30 False NaN -10
1 40 True 1 0
1 56 False NaN 16
2 0 False NaN -14
2 14 True 1 0
2 28 True 2 14
What is the best way to do this?
Thank you.
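The idea is to filter the rows where TreatmentNumber is 1 and turn them into a Series of Days indexed by ID. Series.map then looks up that value for each row's ID, and Series.sub subtracts it from the Days column: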
s = df[df['TreatmentNumber'].eq(1)].set_index('ID')['Days']
# Alternative: build the Series from the first row per ID where TreatmentGiven is True
#s = df[df['TreatmentGiven']].drop_duplicates('ID').set_index('ID')['Days']
df['New_Baseline'] = df['Days'].sub(df['ID'].map(s))
print (df)
ID Days TreatmentGiven TreatmentNumber New_Baseline
0 1 0 False NaN -40
1 1 30 False NaN -10
2 1 40 True 1.0 0
3 1 56 False NaN 16
4 2 0 False NaN -14
5 2 14 True 1.0 0
6 2 28 True 2.0 14
Details:
print (s)
ID
1 40
2 14
Name: Days, dtype: int64
print (df['ID'].map(s))
0 40
1 40
2 40
3 40
4 14
5 14
6 14
Name: ID, dtype: int64
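For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the table in the question; the variable name baseline is my own. It assumes each ID has exactly one row with TreatmentNumber == 1, since the lookup Series needs a unique index.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "Days": [0, 30, 40, 56, 0, 14, 28],
    "TreatmentGiven": [False, False, True, False, False, True, True],
    "TreatmentNumber": [np.nan, np.nan, 1, np.nan, np.nan, 1, 2],
})

# One baseline day per ID: Days on the row where TreatmentNumber == 1
# (assumption: exactly one such row per ID, so the index is unique)
baseline = df.loc[df["TreatmentNumber"].eq(1)].set_index("ID")["Days"]

# Look up each row's baseline by its ID, then subtract it from Days
df["New_Baseline"] = df["Days"].sub(df["ID"].map(baseline))
print(df)
```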
Here is one approach using Series.where + groupby + transform:
s = df['Days'].where(df['TreatmentGiven']).groupby(df['ID']).transform('first')
df['New_Baseline'] = df['Days'].sub(s)
Output:
ID Days TreatmentGiven TreatmentNumber New_Baseline
0 1 0 False NaN -40.0
1 1 30 False NaN -10.0
2 1 40 True 1.0 0.0
3 1 56 False NaN 16.0
4 2 0 False NaN -14.0
5 2 14 True 1.0 0.0
6 2 28 True 2.0 14.0
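Note that the result here is float, because where() replaces the untreated rows with NaN. If integers are preferred, one option is to cast to pandas' nullable Int64 dtype. A rough sketch follows; the extra ID 3 row (an ID that never receives treatment) is a hypothetical addition of mine, not part of the question's data, to show that such IDs end up as missing values rather than raising an error.

```python
import pandas as pd

# Question's sample data plus one hypothetical untreated ID (ID 3)
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2, 3],
    "Days": [0, 30, 40, 56, 0, 14, 28, 5],
    "TreatmentGiven": [False, False, True, False, False, True, True, False],
})

# Keep Days only on treated rows (NaN elsewhere), then broadcast the
# first non-NaN value of each ID group back to every row of that group
first_day = df["Days"].where(df["TreatmentGiven"]).groupby(df["ID"]).transform("first")

# Subtract, then cast to nullable integers; untreated IDs become <NA>
df["New_Baseline"] = df["Days"].sub(first_day).astype("Int64")
print(df)
```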
Here is another approach:
aux = df[df['TreatmentGiven']==True].groupby('ID')['Days'].first().reset_index()
df = df.merge(aux,how='left',on='ID').rename(columns={'Days_x':'Days','Days_y':'New_baseline'})
df['New_baseline'] = df['Days'] - df['New_baseline']
Output:
 ID Days TreatmentGiven TreatmentNumber New_baseline
0 1 0 False NaN -40
1 1 30 False NaN -10
2 1 40 True 1.0 0
3 1 56 False NaN 16
4 2 0 False NaN -14
5 2 14 True 1.0 0
6 2 28 True 2.0 14
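A small variation on the merge idea, in case relying on the automatic _x/_y suffixes feels fragile: rename the helper column before merging so that nothing collides. This is only a sketch; the names first_treatment, BaselineDay and out are placeholders of mine, not from the answer above.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "Days": [0, 30, 40, 56, 0, 14, 28],
    "TreatmentGiven": [False, False, True, False, False, True, True],
    "TreatmentNumber": [np.nan, np.nan, 1, np.nan, np.nan, 1, 2],
})

# First treated day per ID, renamed up front to avoid suffix collisions
first_treatment = (
    df.loc[df["TreatmentGiven"]]
      .groupby("ID", as_index=False)["Days"]
      .first()
      .rename(columns={"Days": "BaselineDay"})
)

# A left merge keeps every original row, including IDs with no treatment
out = df.merge(first_treatment, how="left", on="ID")
out["New_baseline"] = out["Days"] - out["BaselineDay"]
out = out.drop(columns="BaselineDay")
print(out)
```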