pandas compare two dataframes with criteria

Question

I have two dataframes. df1 and df2. I would like to get whatever values are common from df1 and df2 and the dt value of df2 must be greater than df1's dt value In this case, the expected value is fee

df1 = pd.DataFrame([['2015-01-01 06:00','foo'], 
    ['2015-01-01 07:00','fee'], ['2015-01-01 08:00','fum']],
    columns=['dt', 'value'])
df1.dt=pd.to_datetime(df1.dt)

df2=pd.DataFrame([['2015-01-01 06:10','zoo'],
    ['2015-01-01 07:10','fee'],['2015-01-01 08:10','feu'],
    ['2015-01-01 09:10','boo']], columns=['dt', 'value'])
df2.dt=pd.to_datetime(df2.dt)

Answer 1

One way would be to merge on 'value' column so this will produce only matching rows, you can then filter the merged df using the 'dt_x', 'dt_y' columns:

In [15]:

merged = df2.merge(df1, on='value')
merged[merged['dt_x'] > merged['dt_y']]
Out[15]:
                 dt_x value                dt_y
0 2015-01-01 07:10:00   fee 2015-01-01 07:00:00

You can't do something like the following because the lengths don't match:

df2[ (df2['value'].isin(df1['value'])) & (df2['dt'] > df1['dt']) ]

raises:

ValueError: Series lengths must match to compare

pandas compare two dataframes with criteria

Question

1 answers

solution1
2 2015-05-22 20:55:01

pandas compare two dataframes with criteria

Question

1 answers

solution1 2 2015-05-22 20:55:01

solution1
2 2015-05-22 20:55:01