I have a DataFrame of user journeys that shows purchase amounts of products. Users do not buy every day, so I want to forward-fill the last non-null value for each user. Currently, I have:
date | user_id | purchase_value
2020-01-01 | 1 | null
2020-01-02 | 1 | 1
2020-01-03 | 1 | null
2020-01-04 | 1 | 4
2020-01-01 | 2 | 55
2020-01-02 | 2 | null
I want it to look like this:
date | user_id | purchase_value
2020-01-01 | 1 | null
2020-01-02 | 1 | 1
2020-01-03 | 1 | 1
2020-01-04 | 1 | 4
2020-01-01 | 2 | 55
2020-01-02 | 2 | 55
Explanation: For user 1, we fill 1 on 2020-01-03 since this was the last non-null value on 2020-01-02. For user 2, we fill in 55 on 2020-01-02 since this was the last non-null value on 2020-01-01.
How would I do this in pandas for each user_id and date? Also, the dates do not have to be sequential, i.e. there can be gaps in the dates; in that case, always fill in the last non-null value (whenever that was).
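For the plain reading of the question (carry each user's last non-null value forward, however many rows or days back it was), a per-group forward fill is probably enough. The following is a minimal sketch, assuming the column names from the question and that rows should be ordered by date within each user; the sample frame is rebuilt here only so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Sample data from the question (null written as NaN)
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04",
        "2020-01-01", "2020-01-02",
    ]),
    "user_id": [1, 1, 1, 1, 2, 2],
    "purchase_value": [np.nan, 1, np.nan, 4, 55, np.nan],
})

# Sort so that "previous row" means "previous date of the same user";
# gaps between dates do not matter, only the row order does
df = df.sort_values(["user_id", "date"]).reset_index(drop=True)

# Forward-fill within each user; a leading NaN has nothing to copy and stays NaN
df["purchase_value"] = df.groupby("user_id")["purchase_value"].ffill()
print(df)
```

Because the fill is done per group, a value from one user is never carried into the next user's rows.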
If you really want to ffill only the last NaN per group, you need to identify it, then replace it with its ffill:
# is the value NaN?
m1 = df['purchase_value'].isna()
# is this the last NaN of the group?
# here: is this the first NaN of the group in reverse?
m2 = m1[::-1].groupby(df['user_id']).cumsum().eq(1)
# then replace with the ffill per group
df.loc[m1&m2, 'purchase_value'] = df.groupby(['user_id'])['purchase_value'].ffill()
Output:
date user_id purchase_value
0 2020-01-01 1 NaN
1 2020-01-02 1 1.0
2 2020-01-03 1 1.0
3 2020-01-04 1 4.0
4 2020-01-01 2 55.0
5 2020-01-02 2 55.0
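On the sample data this gives the same result as a plain per-group ffill, so the difference only shows up when a user has more than one NaN after a purchase. A small sketch on a hypothetical user with two trailing NaNs (made-up data, not from the question), comparing the two approaches:

```python
import numpy as np
import pandas as pd

# Hypothetical single user: NaN, a purchase, then two NaNs in a row
df = pd.DataFrame({
    "user_id": [1, 1, 1, 1],
    "purchase_value": [np.nan, 1.0, np.nan, np.nan],
})

# Plain per-group forward fill: every NaN after the purchase is filled
filled = df.groupby("user_id")["purchase_value"].ffill()

# Mask approach: select only the last NaN of each group
m1 = df["purchase_value"].isna()
m2 = (
    m1[::-1].astype(int)          # walk each group backwards
    .groupby(df["user_id"]).cumsum()
    .eq(1)                        # rows seen before a second NaN is counted
    .sort_index()                 # restore the original row order
)
mask = m1 & m2

last_only = df["purchase_value"].copy()
last_only.loc[mask] = filled.loc[mask]

print(pd.DataFrame({"ffill": filled, "last_nan_only": last_only}))
```

Here the plain ffill fills rows 2 and 3, while the mask version fills only row 3 and leaves row 2 as NaN.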
Another possible solution:
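import numpy as np
import pandas as pd

# running count of NaNs within each user
df['aux'] = (
    df.assign(aux = pd.isna(df.purchase_value))
    .groupby('user_id')['aux'].cumsum())

# fill only the last NaN of each user (where the count reaches its group max);
# shift within each user so a value cannot leak in from the previous user
(df.assign(
    purchase_value =
    np.where((pd.isna(df.purchase_value)) & (df.aux == df.groupby('user_id')['aux']
    .transform('max')), df.groupby('user_id')['purchase_value'].shift(1), df.purchase_value))
    .drop('aux', axis = 1))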
Output:
date user_id purchase_value
0 2020-01-01 1 NaN
1 2020-01-02 1 1.0
2 2020-01-03 1 1.0
3 2020-01-04 1 4.0
4 2020-01-01 2 55.0
5 2020-01-02 2 55.0