Remove rows from a pandas DataFrame based on a date and time condition

Question

I have two DataFrames, created by the code below.

import datetime
import pandas as pd

Key_DF = pd.DataFrame({'TC': {0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'F', 5: 'G'}, 'D_time': {0: '2/5/2021 10:00', 1: '2/5/2021 22:00', 2: '2/7/2021 11:35', 3: '2/8/2021 11:35', 4: '2/9/2021 11:35', 5: '2/10/2021 11:35'}, 'FName': {0: 'A', 1: 'B', 2: 'C', 3: 'D', 4: 'A', 5: 'B'}})

Main_DF = pd.DataFrame({'Test Case': {0: 'A', 1: 'A', 2: 'B', 3: 'D', 4: 'D', 5: 'G', 6: 'G'}, 'Timestamp': {0: datetime.datetime(2021, 2, 5, 9, 34, 25), 1: datetime.datetime(2021, 2, 5, 14, 34, 25), 2: 'Wed Nov 25 17:30:12 2020', 3: '11/30/2020 11:48:38 AM', 4: 'Mon Feb 8 13:39:00 2021', 5: 'Mon Feb 9 15:42:50 2021', 6: 'Wed Dec  2 14:56:26 2020'}})

Key_DF.D_time = pd.to_datetime(Key_DF.D_time)
Main_DF.Timestamp = pd.to_datetime(Main_DF.Timestamp)
print(Key_DF)
print(Main_DF)

I need to perform the following operations on "Main_DF".

Pick up the data from the Key_DF columns (Ex: "1-1.1" and "2/5/2021 10:00").
Match the Key_DF number (Ex: "1-1.1") with Main_DF.
Remove entries where Main_DF.Timestamp > Key_DF.D_time.
Build a fresh filtered_Df from Main_DF.

The final output should be as follows, after applying the Main_DF.Timestamp > Key_DF.D_time condition.

Any format for the Timestamp column is fine here.

Answer 1

In order to compare datetimes, both columns must have the datetime64[ns] dtype.
- Check the dtypes with .info().
The dataframes can be merged on 'TC' and 'Test Case'.
- 'TC' is renamed to 'Test Case' before merging, so it isn't added as a separate column.
After merging the dataframes, use Boolean selection with (df.Timestamp <= df.D_time) | df.D_time.isna().
- df.D_time.isna() keeps rows whose 'Test Case' has no match in Key_DF, so 'D_time' is NaT after the left merge.
- Removing values where Main_DF.Timestamp > Key_DF.D_time is the same as keeping values where df.Timestamp <= df.D_time.
The final output should have both rows with 'G'.
This assumes unique values in the 'TC' column, as shown in the OP.
Also, nothing in the OP mentions the 'FName' column, so it is disregarded.
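
The dtype requirement in the first point can be checked programmatically rather than by reading the .info() output. The following is a minimal sketch, assuming two small stand-in frames (key and main are hypothetical names with illustrative values, not the OP's data). It also shows errors="coerce", which turns unparseable strings into NaT instead of raising; whether that is appropriate depends on how messy the real timestamps are.

```python
import pandas as pd

# Hypothetical miniature frames; the values are illustrative only.
key = pd.DataFrame({
    "TC": ["A", "B"],
    "D_time": ["2021-02-05 10:00", "2021-02-05 22:00"],
})
main = pd.DataFrame({
    "Test Case": ["A", "A"],
    "Timestamp": ["2021-02-05 09:34:25", "2021-02-05 14:34:25"],
})

# Parse the string columns so they can be compared as datetimes.
key["D_time"] = pd.to_datetime(key["D_time"])
main["Timestamp"] = pd.to_datetime(main["Timestamp"])

# Programmatic equivalent of eyeballing .info(): both columns must be datetime-like.
is_dt = pd.api.types.is_datetime64_any_dtype
dtypes_ok = is_dt(key["D_time"]) and is_dt(main["Timestamp"])
print(key.dtypes)
print(main.dtypes)

# errors="coerce" converts values that cannot be parsed into NaT.
coerced = pd.to_datetime(pd.Series(["2021-02-05 10:00", "not a date"]), errors="coerce")
print(coerced)
```

If dtypes_ok is False, the comparison in the filtering step would be comparing strings or mixed objects rather than datetimes.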

# merge the two dataframes on 'Test Case'; how='left' keeps every row of Main_DF
df = Main_DF.merge(Key_DF[['TC', 'D_time']].rename(columns={'TC': 'Test Case'}), on='Test Case', how='left')

# display(df)
  Test Case           Timestamp              D_time
0         A 2021-02-05 09:34:25 2021-02-05 10:00:00
1         A 2021-02-05 14:34:25 2021-02-05 10:00:00
2         B 2020-11-25 17:30:12 2021-02-05 22:00:00
3         D 2020-11-30 11:48:38 2021-02-08 11:35:00
4         D 2021-02-08 13:39:00 2021-02-08 11:35:00
5         G 2021-02-09 15:42:50 2021-02-10 11:35:00
6         G 2020-12-02 14:56:26 2021-02-10 11:35:00
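
The merge above relies on 'TC' being unique, and the later isna() check relies on unmatched rows getting NaT. As a hedged sketch of how both assumptions could be made visible, the example below uses the validate and indicator parameters of DataFrame.merge on small stand-in frames. The frames, the unmatched test case 'Z', and the variable names are hypothetical and not part of the OP's data.

```python
import pandas as pd

# Hypothetical stand-in data; 'Z' has no entry in the key frame.
main = pd.DataFrame({
    "Test Case": ["A", "A", "Z"],
    "Timestamp": pd.to_datetime(
        ["2021-02-05 09:34:25", "2021-02-05 14:34:25", "2021-02-06 08:00:00"]
    ),
})
key = pd.DataFrame({
    "TC": ["A", "B"],
    "D_time": pd.to_datetime(["2021-02-05 10:00:00", "2021-02-05 22:00:00"]),
})

merged = main.merge(
    key.rename(columns={"TC": "Test Case"}),
    on="Test Case",
    how="left",
    validate="many_to_one",  # raises pandas.errors.MergeError if the key column has duplicates
    indicator=True,          # adds a '_merge' column: 'both' or 'left_only'
)
print(merged)

# A key frame with a duplicated 'TC' value is rejected by validate.
dup_key = pd.concat([key, key.iloc[[0]]], ignore_index=True)
try:
    main.merge(
        dup_key.rename(columns={"TC": "Test Case"}),
        on="Test Case",
        how="left",
        validate="many_to_one",
    )
    duplicate_detected = False
except pd.errors.MergeError:
    duplicate_detected = True
print(duplicate_detected)
```

Rows marked 'left_only' are exactly the ones whose D_time is NaT, which is why the filter needs the isna() branch to keep them.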

# filter the dataframe to keep rows where Timestamp is <= D_time, or where D_time is missing
df = df[(df.Timestamp <= df.D_time) | df.D_time.isna()].drop(columns=['D_time']).reset_index(drop=True)

# display(df)
  Test Case           Timestamp
0         A 2021-02-05 09:34:25
1         B 2020-11-25 17:30:12
2         D 2020-11-30 11:48:38
3         G 2021-02-09 15:42:50
4         G 2020-12-02 14:56:26
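
As an alternative to the merge, the cutoff for each row can be looked up with Series.map, which avoids adding and then dropping the D_time column. This is a sketch of a different technique, not the method used in the answer above. It carries the same assumption that 'TC' is unique, and drop_rows_after_cutoff is a hypothetical helper name. The data reproduces the already parsed values shown in the answer's tables.

```python
import pandas as pd

def drop_rows_after_cutoff(main, key):
    """Keep rows of `main` whose Timestamp is <= the D_time of the matching TC.

    Rows with no matching TC are kept. Assumes key['TC'] is unique.
    """
    cutoff_by_tc = key.set_index("TC")["D_time"]
    # Look up each row's cutoff; test cases missing from `key` map to NaT.
    cutoff = main["Test Case"].map(cutoff_by_tc)
    keep = (main["Timestamp"] <= cutoff) | cutoff.isna()
    return main[keep].reset_index(drop=True)

# Same values as the parsed frames displayed in the answer.
Key_DF = pd.DataFrame({
    "TC": ["A", "B", "C", "D", "F", "G"],
    "D_time": pd.to_datetime([
        "2021-02-05 10:00:00", "2021-02-05 22:00:00", "2021-02-07 11:35:00",
        "2021-02-08 11:35:00", "2021-02-09 11:35:00", "2021-02-10 11:35:00",
    ]),
})
Main_DF = pd.DataFrame({
    "Test Case": ["A", "A", "B", "D", "D", "G", "G"],
    "Timestamp": pd.to_datetime([
        "2021-02-05 09:34:25", "2021-02-05 14:34:25", "2020-11-25 17:30:12",
        "2020-11-30 11:48:38", "2021-02-08 13:39:00", "2021-02-09 15:42:50",
        "2020-12-02 14:56:26",
    ]),
})

filtered = drop_rows_after_cutoff(Main_DF, Key_DF)
print(filtered)
```

The merge version is probably easier to inspect, since the intermediate frame shows each Timestamp next to its D_time; the map version is shorter when only the filtered rows are needed.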

Answer 1: 1, accepted, 2021-02-08 19:48:54
