How to get the latest date based on a given date using Python?

Question

Consider the following two dataframes:

Dataframe1 contains a list of users and stop dates.

Dataframe2 contains a history of user transactions and dates.

I want to get the last transaction date before the stop date for all users in Dataframe1 (some users in Dataframe1 have multiple stop dates).

I want the output to look like the following:

Answer 1

Here is one way to accomplish this (make sure both date columns are already datetime):

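import pandas as pd

df = pd.merge(df1, df2, on="UserID")

# For each row, look only at that user's transactions before the stop date.
# pd.NaT marks rows with no earlier transaction (pd.nan does not exist).
def last_before_stop(row):
    same_user = df["UserID"] == row["UserID"]
    earlier = df["Transaction_Date"] < row["Stop_Date"]
    dates = df.loc[same_user & earlier, "Transaction_Date"]
    return dates.max() if len(dates) != 0 else pd.NaT

df["Last_Before_Stop"] = df.apply(last_before_stop, axis=1)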
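As an alternative sketch, not part of the answer above: pandas also provides `pd.merge_asof`, which matches each left row to the nearest earlier key within a group. The example below assumes the column names used in this thread (`UserID`, `Stop_Date`, `Transaction_Date`) and a small hand-written sample; the output column name `Last_Before_Stop` is reused from the snippet above. Setting `allow_exact_matches=False` makes the comparison strictly "before".

```python
import pandas as pd

# Small hand-written sample; column names follow this thread.
df1 = pd.DataFrame({
    "UserID": [1, 2, 3, 3],
    "Stop_Date": pd.to_datetime(
        ["2022-02-02", "2022-06-09", "2022-07-25", "2022-09-14"]
    ),
})
df2 = pd.DataFrame({
    "UserID": [1, 1, 1, 2, 3, 3, 3],
    "Transaction_Date": pd.to_datetime([
        "2022-01-02", "2022-02-01", "2022-02-03", "2022-03-22",
        "2022-07-20", "2022-09-13", "2022-09-14",
    ]),
})

# merge_asof requires both frames to be sorted by their "on" keys.
result = pd.merge_asof(
    df1.sort_values("Stop_Date"),
    df2.sort_values("Transaction_Date"),
    left_on="Stop_Date",
    right_on="Transaction_Date",
    by="UserID",                # only match rows of the same user
    direction="backward",       # take the latest earlier transaction
    allow_exact_matches=False,  # strictly before the stop date
).rename(columns={"Transaction_Date": "Last_Before_Stop"})

print(result)
```

Rows of `df1` without any earlier transaction stay in the result with `NaT` in `Last_Before_Stop`. If a transaction on the stop date itself should count, dropping `allow_exact_matches=False` restores the default inclusive match.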

Answer 2

Please always provide data in a form that makes it easy to use as a sample (i.e. as text, not as images).

You could try:

import pandas as pd

# Parse the date strings (month/day/two-digit year).
df1["Stop_Date"] = pd.to_datetime(df1["Stop_Date"], format="%m/%d/%y")
df2["Transaction_Date"] = pd.to_datetime(df2["Transaction_Date"], format="%m/%d/%y")
df = (
    df1.merge(df2, on="UserID", how="left")
    # Keep transactions on or before the stop date.
    .loc[lambda df: df["Stop_Date"] >= df["Transaction_Date"]]
    .groupby(["UserID", "Stop_Date"])["Transaction_Date"].max()
    .to_frame().reset_index().drop(columns="Stop_Date")
)

Convert the date columns to datetime.
Merge df2 onto df1 on UserID.
Remove the rows whose Transaction_Date is later than Stop_Date.
Group the result by UserID and Stop_Date, and take the maximum Transaction_Date.
Bring the result into shape: convert back to a dataframe, reset the index, and drop Stop_Date.

Result for

df1:

   UserID Stop_Date
0       1    2/2/22
1       2    6/9/22
2       3   7/25/22
3       3   9/14/22

df2:

   UserID Transaction_Date
0       1           1/2/22
1       1           2/1/22
2       1           2/3/22
3       2          1/24/22
4       2          3/22/22
5       3          6/25/22
6       3          7/20/22
7       3          9/13/22
8       3          9/14/22
9       4           2/2/22

is

   UserID Transaction_Date
0       1       2022-02-01
1       2       2022-03-22
2       3       2022-07-20
3       3       2022-09-14
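To reproduce the result above, the sample frames can be rebuilt from the printed tables. The following is a minimal sketch that splits the chained expression into its individual steps; the intermediate names `merged`, `kept`, `latest`, and `result` are introduced here only for illustration.

```python
import pandas as pd

# Sample data copied from the tables above.
df1 = pd.DataFrame({
    "UserID": [1, 2, 3, 3],
    "Stop_Date": ["2/2/22", "6/9/22", "7/25/22", "9/14/22"],
})
df2 = pd.DataFrame({
    "UserID": [1, 1, 1, 2, 2, 3, 3, 3, 3, 4],
    "Transaction_Date": [
        "1/2/22", "2/1/22", "2/3/22", "1/24/22", "3/22/22",
        "6/25/22", "7/20/22", "9/13/22", "9/14/22", "2/2/22",
    ],
})

# Step 1: parse the date strings.
df1["Stop_Date"] = pd.to_datetime(df1["Stop_Date"], format="%m/%d/%y")
df2["Transaction_Date"] = pd.to_datetime(df2["Transaction_Date"], format="%m/%d/%y")

# Step 2: pair each stop date with all transactions of the same user.
merged = df1.merge(df2, on="UserID", how="left")

# Step 3: keep transactions on or before the stop date.
kept = merged[merged["Stop_Date"] >= merged["Transaction_Date"]]

# Step 4: latest remaining transaction per (user, stop date).
latest = kept.groupby(["UserID", "Stop_Date"])["Transaction_Date"].max()

# Step 5: back to a flat frame without the Stop_Date column.
result = latest.to_frame().reset_index().drop(columns="Stop_Date")
print(result)
```

Note that step 3 uses `>=`, so a transaction on the stop date itself counts; that is why user 3 gets 2022-09-14 for the 9/14/22 stop date. Using `>` instead would give the strictly earlier transaction.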

If you don't want to permanently change the dtype to datetime, and also want the result as strings formatted like the input (but zero-padded), then you could try:

df = (
    df1
    .assign(Stop_Date=pd.to_datetime(df1["Stop_Date"], format="%m/%d/%y"))
    .merge(
        df2.assign(Transaction_Date=pd.to_datetime(df2["Transaction_Date"], format="%m/%d/%y")),
        on="UserID", how="left"
    )
    .loc[lambda df: df["Stop_Date"] >= df["Transaction_Date"]]
    .groupby(["UserID", "Stop_Date"])["Transaction_Date"].max()
    .to_frame().reset_index().drop(columns="Stop_Date")
    .assign(Transaction_Date=lambda df: df["Transaction_Date"].dt.strftime("%m/%d/%y"))
)

Result:

   UserID Transaction_Date
0       1         02/01/22
1       2         03/22/22
2       3         07/20/22
3       3         09/14/22
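Both variants above drop `Stop_Date` and omit any (UserID, Stop_Date) pair that has no qualifying transaction. If that matters, one possible variation (a sketch, not part of the original answer) keeps `Stop_Date` and left-joins the per-group maximum back onto `df1`, leaving `NaT` where nothing matches. User 5 and the column name `Last_Transaction` are assumptions added purely for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "UserID": [3, 3, 5],  # user 5 is hypothetical and has no transactions
    "Stop_Date": pd.to_datetime(
        ["7/25/22", "9/14/22", "1/1/22"], format="%m/%d/%y"
    ),
})
df2 = pd.DataFrame({
    "UserID": [3, 3, 3, 3],
    "Transaction_Date": pd.to_datetime(
        ["6/25/22", "7/20/22", "9/13/22", "9/14/22"], format="%m/%d/%y"
    ),
})

# Same filter-and-group logic as above, but Stop_Date stays as a column.
latest = (
    df1.merge(df2, on="UserID", how="left")
    .loc[lambda df: df["Stop_Date"] >= df["Transaction_Date"]]
    .groupby(["UserID", "Stop_Date"], as_index=False)["Transaction_Date"].max()
    .rename(columns={"Transaction_Date": "Last_Transaction"})
)

# Left-join back so every row of df1 survives, matched or not.
result = df1.merge(latest, on=["UserID", "Stop_Date"], how="left")
print(result)
```

With `Stop_Date` kept, the two rows for user 3 can be told apart, and user 5 appears with a missing `Last_Transaction` instead of disappearing.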

Problem description

2 solutions

Solution 1
0 2022-09-14 21:45:37

Solution 2
0 2022-09-15 09:50:51
