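Calculate Columns in Pandas based on multiple rows
I have a DataFrame with the columns shown below.
I want to add a column that indicates whether each row's event also occurred in its previous transaction (the row whose transaction_ID equals previous_trans).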
import pandas as pd

items = pd.DataFrame({'event': ['A', 'B', 'B', 'A', 'C', 'C', 'C'],
                      'transaction_ID': [1, 2, 3, 4, 5, 6, 7],
                      'previous_trans': [2, 3, 5, 7, 4, 1, 6]})
items["Same_Event_in_prev_trans"] = 0  # placeholder, to be filled in
The "Same_Event_in_prev_trans" column should contain the values 0 1 0 0 0 0 1.
I'm not sure how to do this without a for loop.
Thanks.
You can use a lambda to look up the event of the previous transaction.
items["Same_Event_in_prev_trans"] = (
    items.apply(lambda x: 1 if x.event == items.set_index('transaction_ID')
                                               .loc[x.previous_trans, 'event']
                else 0, axis=1)
)
items
Out[239]:
  event  previous_trans  transaction_ID  Same_Event_in_prev_trans
0     A               2               1                         0
1     B               3               2                         1
2     B               5               3                         0
3     A               7               4                         0
4     C               4               5                         0
5     C               1               6                         0
6     C               6               7                         1
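The lambda above rebuilds the transaction_ID index for every row. A minimal vectorized sketch of the same lookup, assuming transaction_ID values are unique, could use Series.map instead. The names event_by_id and prev_event are only illustrative and do not come from the original answer.

```python
import pandas as pd

items = pd.DataFrame({'event': ['A', 'B', 'B', 'A', 'C', 'C', 'C'],
                      'transaction_ID': [1, 2, 3, 4, 5, 6, 7],
                      'previous_trans': [2, 3, 5, 7, 4, 1, 6]})

# Lookup table: transaction_ID -> event (assumes transaction_ID is unique)
event_by_id = items.set_index('transaction_ID')['event']

# Event of the transaction each row points to; NaN if the ID is unknown
prev_event = items['previous_trans'].map(event_by_id)

# NaN never compares equal, so unknown IDs end up as 0
items['Same_Event_in_prev_trans'] = (items['event'] == prev_event).astype(int)
```

The index is built once here, so the work no longer grows with a per-row set_index call.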
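I'm not entirely sure about the logic, but checking, for each event, whether previous_trans is among that event's transaction_ID values seems to give the desired output: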
items["Same_Event_in_prev_trans"] = (items.groupby('event', group_keys=False)
                                          .apply(lambda g: g.previous_trans.isin(g.transaction_ID))
                                          .astype(int))
items
#   event  previous_trans  transaction_ID  Same_Event_in_prev_trans
#0      A               2               1                         0
#1      B               3               2                         1
#2      B               5               3                         0
#3      A               7               4                         0
#4      C               4               5                         0
#5      C               1               6                         0
#6      C               6               7                         1
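One way to see why the per-event membership check works: a row gets 1 exactly when the pair (event, previous_trans) also exists as an (event, transaction_ID) pair somewhere in the frame. A rough sketch of that same idea without groupby, using MultiIndex.isin, might look like this. The names existing_pairs and wanted_pairs are hypothetical.

```python
import pandas as pd

items = pd.DataFrame({'event': ['A', 'B', 'B', 'A', 'C', 'C', 'C'],
                      'transaction_ID': [1, 2, 3, 4, 5, 6, 7],
                      'previous_trans': [2, 3, 5, 7, 4, 1, 6]})

# Every (event, transaction_ID) pair that actually exists
existing_pairs = pd.MultiIndex.from_frame(items[['event', 'transaction_ID']])

# The pair each row would need: its own event at its previous transaction
wanted_pairs = pd.MultiIndex.from_frame(items[['event', 'previous_trans']])

# isin returns a boolean NumPy array, one entry per row of items
items['Same_Event_in_prev_trans'] = wanted_pairs.isin(existing_pairs).astype(int)
```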
How about:
# Note: the default inner merge stays row-aligned with items only if every
# previous_trans value exists in transaction_ID
items['prev_event'] = pd.merge(items, items[['event', 'transaction_ID']],
                               left_on='previous_trans',
                               right_on='transaction_ID')['event_y']
items['same_event'] = (items['event'] == items['prev_event']).astype(int)
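If some previous_trans values might be missing from transaction_ID, an inner merge drops those rows and the positional assignment no longer lines up. A hedged variant of the merge idea, assuming one row per transaction_ID, could use a left merge with validation. The lookup frame and its renamed columns are illustrative choices, not part of the original answer.

```python
import pandas as pd

items = pd.DataFrame({'event': ['A', 'B', 'B', 'A', 'C', 'C', 'C'],
                      'transaction_ID': [1, 2, 3, 4, 5, 6, 7],
                      'previous_trans': [2, 3, 5, 7, 4, 1, 6]})

# One row per transaction, renamed so the merge needs no _x/_y suffixes
lookup = items[['transaction_ID', 'event']].rename(
    columns={'transaction_ID': 'previous_trans', 'event': 'prev_event'})

# how='left' keeps every row of items in its original order;
# validate raises MergeError if a transaction ID is duplicated in lookup
merged = items.merge(lookup, on='previous_trans', how='left',
                     validate='many_to_one')

# to_numpy() assigns by position, so a non-default index on items is fine
items['prev_event'] = merged['prev_event'].to_numpy()
items['same_event'] = (items['event'] == items['prev_event']).astype(int)
```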