How do I compare date columns in two different data frames based on the same ID?
pandas
I have two data frames and want to do a nested loop.
I want to iterate over each row of df1 and select col1 (the ID) and col2 (the date).
Then, for that ID, I want to iterate through df2, check whether each row has the same ID, and if so compare the date column from df1 with the date column in df2.
If col2 in df1 is less than col2 in df2, it should return True and append that to the row of df1.
Essentially, this is what I'm trying to do (or a faster way, if there is one):
for (row1 : df1) {
    row1[col4] = False
    for (row2 : df2) {
        if (row1[col1] == row2[col1] && row1[col2] < row2[col2]) {
            row1[col4] = True
            break
        }
    }
}
df1
col1  col2        col3  col4
01    01/01/2018  S     True
02    11/21/2018  F     False
03    04/03/2018  C     True
df2
col1  col2        col3
01    10/01/2018  A
02    01/01/2018  A
02    01/31/2018  F
02    10/01/2018  D
02    09/01/2018  V
03    02/01/2018  W
03    07/01/2018  X
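For reference, the loop described above can be written out literally in pandas. This is only a baseline sketch of the intended logic, not a recommended approach: it assumes col1 holds integer IDs (the leading zeros being display only) and that col2 has been parsed with pd.to_datetime using a month/day/year format. The name `flags` is made up for illustration.

```python
import pandas as pd

# Sample data from the question; integer IDs and the %m/%d/%Y format are assumptions.
df1 = pd.DataFrame({
    "col1": [1, 2, 3],
    "col2": pd.to_datetime(["01/01/2018", "11/21/2018", "04/03/2018"],
                           format="%m/%d/%Y"),
    "col3": ["S", "F", "C"],
})
df2 = pd.DataFrame({
    "col1": [1, 2, 2, 2, 2, 3, 3],
    "col2": pd.to_datetime(["10/01/2018", "01/01/2018", "01/31/2018", "10/01/2018",
                            "09/01/2018", "02/01/2018", "07/01/2018"],
                           format="%m/%d/%Y"),
    "col3": ["A", "A", "F", "D", "V", "W", "X"],
})

flags = []
for _, r1 in df1.iterrows():
    # True as soon as any df2 row with the same ID has a strictly later date.
    found = False
    for _, r2 in df2.iterrows():
        if r1["col1"] == r2["col1"] and r1["col2"] < r2["col2"]:
            found = True
            break
    flags.append(found)

df1["col4"] = flags
print(df1)
```

This visits up to len(df1) * len(df2) row pairs, which is why a vectorized alternative is worth looking for on larger frames.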
pandas.merge_asof
First, for merge_asof to work, you need to sort both frames by the dates. The date columns also need to be real datetimes (for example, converted with pd.to_datetime) rather than strings.
df1.sort_values(['col2', 'col1'], inplace=True)
df2.sort_values(['col2', 'col1'], inplace=True)
Now we can merge:
pd.merge_asof(
df1, df2.rename(columns={'col3': 'col4'}),
on='col2', by='col1', direction='forward'
).assign(col4=lambda d: d.col4.notna())
   col1       col2 col3   col4
0     1 2018-01-01    S   True
1     3 2018-04-03    C   True
2     2 2018-11-21    F  False
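To try this end to end, here is a self-contained sketch that rebuilds the sample frames and runs the same merge. The frame construction is an assumption on my part: integer IDs (matching the output above) and a month/day/year date format. The name `result` is only for illustration.

```python
import pandas as pd

# Sample data from the question; integer IDs and the %m/%d/%Y format are assumptions.
df1 = pd.DataFrame({
    "col1": [1, 2, 3],
    "col2": pd.to_datetime(["01/01/2018", "11/21/2018", "04/03/2018"],
                           format="%m/%d/%Y"),
    "col3": ["S", "F", "C"],
})
df2 = pd.DataFrame({
    "col1": [1, 2, 2, 2, 2, 3, 3],
    "col2": pd.to_datetime(["10/01/2018", "01/01/2018", "01/31/2018", "10/01/2018",
                            "09/01/2018", "02/01/2018", "07/01/2018"],
                           format="%m/%d/%Y"),
    "col3": ["A", "A", "F", "D", "V", "W", "X"],
})

# merge_asof requires both frames to be sorted by the `on` key.
df1 = df1.sort_values(["col2", "col1"])
df2 = df2.sort_values(["col2", "col1"])

# For each df1 row, look forward in df2 (within the same col1) for the first
# date at or after df1's date; rows with no such match get NaN in col4.
result = pd.merge_asof(
    df1, df2.rename(columns={"col3": "col4"}),
    on="col2", by="col1", direction="forward",
).assign(col4=lambda d: d["col4"].notna())

print(result)
```

Note that merge_asof matches equal dates by default. If the comparison has to be strictly "less than", as in the question, passing allow_exact_matches=False should exclude same-day matches.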