
Adding a new column whose values are determined by another column (after groupby)

This is what the original dataframe looks like. I want to add a new column called [withdraw_#] that records how many times each parent_user_id withdrew money. I don't know what steps to take after groupby('parent_user_id').

This is what the revised dataframe should look like

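# Count rows per (user, side) pair; transform keeps the original row index,
# so the counts line up with df and can be assigned as a new column
df['WITHDRAW_#'] = df.groupby(['user', 'side'])['amount'].transform('count')
df['WITHDRAW_#'] = df['WITHDRAW_#'].fillna(0).astype(int)
print(df)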

Input:

    user      side amount
0  10067  WITHDRAW   2000
1  10057   DEPOSIT   5000
2  10067  WITHDRAW   1000
3  10057  WITHDRAW   6000

Output:

    user      side amount  WITHDRAW_#
0  10067  WITHDRAW   2000           2
1  10057   DEPOSIT   5000           1
2  10067  WITHDRAW   1000           2
3  10057  WITHDRAW   6000           1
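
For reference, the snippet above can be reproduced end to end with a minimal sketch. The DataFrame is rebuilt from the Input table, and selecting the `amount` column before `transform` is an assumption made here so that the result is a single Series that can be assigned to one column.

```python
import pandas as pd

# Rebuild the Input table shown above
df = pd.DataFrame({
    "user": [10067, 10057, 10067, 10057],
    "side": ["WITHDRAW", "DEPOSIT", "WITHDRAW", "WITHDRAW"],
    "amount": [2000, 5000, 1000, 6000],
})

# Count rows per (user, side) pair; transform returns one value per
# original row, so the result aligns with df's index
df["WITHDRAW_#"] = df.groupby(["user", "side"])["amount"].transform("count")
print(df)
```

Because the grouping key includes `side`, the value on the DEPOSIT row is the number of deposits for that user, not the number of withdrawals.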
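
def fun(x):
    # Number of WITHDRAW rows in this user's group
    return (x == "WITHDRAW").sum()
# transform (not agg) broadcasts the per-user count back to every row of df
df["withdraw_#"] = df.groupby("user_id")["side"].transform(fun)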

or

df["withdraw_#"] = df.groupby("user_id")["side"].transform(lambda x: (x == "WITHDRAW").sum())
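
A vectorized variant is also possible: build a 0/1 mask for withdrawals and sum it within each user group, which avoids calling a Python function per group. The sketch below assumes the column names `user_id` and `side` used in this answer and borrows the sample rows from the Input table above.

```python
import pandas as pd

# Sample rows taken from the Input table; "user_id" is the assumed column name
df = pd.DataFrame({
    "user_id": [10067, 10057, 10067, 10057],
    "side": ["WITHDRAW", "DEPOSIT", "WITHDRAW", "WITHDRAW"],
    "amount": [2000, 5000, 1000, 6000],
})

# 1 where the row is a withdrawal, 0 otherwise
is_withdraw = df["side"].eq("WITHDRAW").astype(int)

# Sum the mask within each user; transform broadcasts the per-user total
# back to every row of that user, including DEPOSIT rows
df["withdraw_#"] = is_withdraw.groupby(df["user_id"]).transform("sum")
print(df)
```

Here every row of a user carries that user's withdrawal count, regardless of whether the row itself is a deposit or a withdrawal.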
