Find the second most recent date in a dataframe column

Question

I have data, for example:

Sampled_Date
8/29/2017
8/29/2017
8/29/2017
2/28/2016
2/28/2016
5/15/2014

And so on. I can already find the max and min dates with:

df.Sampled_Date.max()
df.Sampled_Date.min()

But how do I find the second most recent date, i.e. 2/28/2016, in a pandas DataFrame?

Answer 1

First, make sure your dates are in datetime format:

df['Sampled_Date'] = pd.to_datetime(df['Sampled_Date'])

Then drop the duplicates, take nlargest(2), and take the last value of that:

df['Sampled_Date'].drop_duplicates().nlargest(2).iloc[-1]

# Timestamp('2016-02-28 00:00:00')
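
For reference, here is a minimal end-to-end sketch of the steps above, using the sample values from the question. It assumes the dates arrive as month/day/year strings; the explicit format string is my addition and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question; the dates start out as strings.
df = pd.DataFrame({"Sampled_Date": ["8/29/2017", "8/29/2017", "8/29/2017",
                                    "2/28/2016", "2/28/2016", "5/15/2014"]})

# Assumption: month/day/year strings. An explicit format avoids ambiguity.
df["Sampled_Date"] = pd.to_datetime(df["Sampled_Date"], format="%m/%d/%Y")

# Drop duplicates, keep the two largest dates, take the smaller of the two.
second_latest = df["Sampled_Date"].drop_duplicates().nlargest(2).iloc[-1]
print(second_latest)  # 2016-02-28 00:00:00
```

Without drop_duplicates(), nlargest(2) would return 8/29/2017 twice, because the maximum appears more than once in the column.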

Answer 2

You can also use .argsort():

import pandas as pd

# Generate dates
dates = pd.Series(pd.date_range(start='1/1/2017', periods=5, freq=pd.offsets.MonthEnd(3)))

# Random order
dates = dates.sample(frac=1, random_state=0)

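# Get the second 'max' date: argsort() lists positions in ascending order,
# so its second-to-last entry is the position of the second-largest value
dates.iloc[dates.argsort().iloc[-2]] # Timestamp('2017-10-31 00:00:00')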

Answer 3

I know this is an extension of the question, but it's something I frequently need and sometimes forget, so I'm sharing it here:

Let's say that instead of wanting the second most recent or second earliest date for an entire dataframe, you have a dataframe of users and dates, and you want the second earliest date for each user (e.g. their second transaction).

Example dataframe:

test = pd.DataFrame()
test['users'] = [1,2,3,2,3,2]
test['dates'] = pd.to_datetime(['2019-01-01','2019-01-01',
                                '2019-01-02','2019-01-02',
                                '2019-01-03','2019-01-04'])

The earliest date for user 2 is '2019-01-01' and the second earliest date is '2019-01-02'. We can use groupby, apply, and nlargest/nsmallest:

test.groupby('users')['dates'].apply(lambda x: x.nsmallest(2).max())

which gives us this output:

users
1   2019-01-01
2   2019-01-02
3   2019-01-03
Name: dates, dtype: datetime64[ns]
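
Note that user 1 has only one date, so nsmallest(2).max() falls back to that single date. If you would rather get a missing value in that case, one possible variation is sketched below. The helper name second_earliest is hypothetical, and it also drops duplicate dates within a user, which the original one-liner does not.

```python
import pandas as pd

test = pd.DataFrame({
    "users": [1, 2, 3, 2, 3, 2],
    "dates": pd.to_datetime(["2019-01-01", "2019-01-01",
                             "2019-01-02", "2019-01-02",
                             "2019-01-03", "2019-01-04"]),
})

def second_earliest(dates: pd.Series):
    """Return the second-earliest distinct date, or NaT if there is none."""
    distinct = dates.drop_duplicates().sort_values()
    return distinct.iloc[1] if len(distinct) > 1 else pd.NaT

# One value per user; users with a single distinct date get NaT.
result = test.groupby("users")["dates"].apply(second_earliest)
print(result)
```

Here user 1 comes back as NaT, while users 2 and 3 keep 2019-01-02 and 2019-01-03 as before.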

Answer 4

# second most recent date (drop duplicates first, otherwise a repeated max is returned)
df.Sampled_Date.drop_duplicates().sort_values(ascending=False).iloc[1]
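
The sorting approach generalizes to the n-th most recent date. The sketch below is only an illustration: the helper nth_most_recent is a hypothetical name, and it assumes the column has already been converted to datetime. It also shows what happens when duplicates are left in.

```python
import pandas as pd

def nth_most_recent(dates: pd.Series, n: int = 2):
    """Return the n-th most recent distinct date, or NaT if there are fewer than n."""
    distinct = dates.drop_duplicates().sort_values(ascending=False)
    return distinct.iloc[n - 1] if len(distinct) >= n else pd.NaT

s = pd.to_datetime(pd.Series(["8/29/2017", "8/29/2017", "2/28/2016", "5/15/2014"]),
                   format="%m/%d/%Y")

naive = s.sort_values(ascending=False).iloc[1]  # 2017-08-29, a duplicate of the max
second = nth_most_recent(s, 2)                  # 2016-02-28
missing = nth_most_recent(s, 5)                 # NaT, only three distinct dates exist
```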
