I have a pandas dataframe with the purchase dates of each customer. I want to find the most recent and the second most recent purchase date for each unique customer. Here is my dataframe:
name date
ab1 6/1/18
ab1 6/2/18
ab1 6/3/18
ab1 6/4/18
ab2 6/8/18
ab2 6/9/18
ab3 6/23/18
I am expecting the following output:
name  second most recent date  most recent date
ab1   6/3/18                   6/4/18
ab2   6/8/18                   6/9/18
ab3   6/23/18                  6/23/18
I know data['date'].max() can give the most recent purchase date, but I have no idea how to find the second most recent date. Any help will be highly appreciated.
To get the two most recent purchase dates for each customer, first sort the dataframe by date in descending order, then group by name and spread each customer's dates into individual columns. Taking the first two of those columns gives the two most recent purchase dates for each customer.
Here's an example:
import pandas as pd

# set up data from your example
df = pd.DataFrame({
    "name": ["ab1", "ab1", "ab1", "ab1", "ab2", "ab2", "ab3"],
    "date": ["6/1/18", "6/2/18", "6/3/18", "6/4/18", "6/8/18", "6/9/18", "6/23/18"]
})

# create column of datetimes (for sorting reverse-chronologically)
df["datetime"] = pd.to_datetime(df.date)

# group by name and convert dates into individual columns
grouped_df = df.sort_values(
    "datetime", ascending=False
).groupby("name")["date"].apply(list).apply(pd.Series).reset_index()

# truncate and rename columns
grouped_df = grouped_df[["name", 0, 1]]
grouped_df.columns = ["name", "most_recent", "second_most_recent"]
This leaves grouped_df looking like this:
name most_recent second_most_recent
0 ab1 6/4/18 6/3/18
1 ab2 6/9/18 6/8/18
2 ab3 6/23/18 NaN
If you want to fill any missing second_most_recent values with the corresponding most_recent value, you can use np.where, like this:
import numpy as np
grouped_df["second_most_recent"] = np.where(
grouped_df.second_most_recent.isna(),
grouped_df.most_recent,
grouped_df.second_most_recent
)
With result:
name most_recent second_most_recent
0 ab1 6/4/18 6/3/18
1 ab2 6/9/18 6/8/18
2 ab3 6/23/18 6/23/18
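As an alternative to building lists per group, the same sort-then-group idea can be sketched with groupby(...).cumcount() and pivot. This is only one possible variant, not the approach above: the column name "recency", the explicit format="%m/%d/%y", and the use of fillna in place of np.where are assumptions made for illustration. It keeps the results as real datetimes rather than the original strings.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["ab1", "ab1", "ab1", "ab1", "ab2", "ab2", "ab3"],
    "date": ["6/1/18", "6/2/18", "6/3/18", "6/4/18", "6/8/18", "6/9/18", "6/23/18"],
})

# parse the strings so that sorting is chronological, not alphabetical
# (the month/day/two-digit-year format is an assumption about the data)
df["datetime"] = pd.to_datetime(df["date"], format="%m/%d/%y")

# number each purchase within its customer: 0 = most recent, 1 = second most recent
# ("recency" is a hypothetical helper column)
ordered = df.sort_values("datetime", ascending=False)
ordered["recency"] = ordered.groupby("name").cumcount()

# keep the two newest purchases per customer and spread them into columns
top_two = (
    ordered[ordered["recency"] < 2]
    .pivot(index="name", columns="recency", values="datetime")
    .reindex(columns=[0, 1])  # guarantees both columns exist even if no customer has two purchases
    .rename(columns={0: "most_recent", 1: "second_most_recent"})
)

# customers with a single purchase reuse their most recent date
top_two["second_most_recent"] = top_two["second_most_recent"].fillna(top_two["most_recent"])

result = top_two.reset_index()
result.columns.name = None  # drop the leftover "recency" label on the column index
print(result)
```

Because the values stay as datetimes, they can be formatted back to strings at the end with .dt.strftime if the original display style is needed.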