How to check whether a combination of two column values exists in another DataFrame in pandas?

Question

I have multiple DataFrames that share two common columns.

Sample df:

user_id | event_date
abc     | 1st June
abc     | 2nd June
cdf     | 15th July
dfg     | 17th July

I want to check whether a user_id on a particular event_date in df1 also exists in df2, df3, df4, and df5.

How do I find this?

I tried the following methods, but they only took user_id into consideration and not event_date.

Method 1:

upi_sms =df1.assign(Insms=df2.user_id.isin(df1.user_id).astype(int))

Method 2: merging the DataFrames with on=[user_id, event_date].

Neither of these gives me the expected results.

Expected result:

The combination of abc and 1st June should be found in df2.

How do I achieve this?

Answer 1

I would do it the following way. Consider a simple example:

import pandas as pd
df1 = pd.DataFrame({'x':['A','B','C'],'y':[1,2,3]})
df2 = pd.DataFrame({'x':['C','A','B'],'y':[3,2,1]})
df3 = pd.DataFrame({'x':['A','B','C'],'y':[0,0,0]})

and say you are interested in the last row of df1, i.e. where x is C and y is 3. Such a row is also present in df2 (as its first row) but not in df3, where the row with x equal to C has a different y value.

row = tuple(df1.iloc[-1]) # get last row of df1 as tuple
print(row in df2.itertuples(index=False)) # True
print(row in df3.itertuples(index=False)) # False

Note that it is important to pass index=False, because we do not want the row's index label to be part of the comparison.
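The check above tests a single row. To flag every row of df1 against several other DataFrames at once, one possible vectorized sketch builds a MultiIndex from the two key columns and uses MultiIndex.isin. The sample data, the frames df2 and df3, and the output column names in_df2 and in_df3 are assumptions made up for illustration, not taken from the question's real data.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df1 = pd.DataFrame({
    "user_id": ["abc", "abc", "cdf", "dfg"],
    "event_date": ["1st June", "2nd June", "15th July", "17th July"],
})
df2 = pd.DataFrame({
    "user_id": ["abc", "cdf", "xyz"],
    "event_date": ["1st June", "16th July", "2nd June"],
})
df3 = pd.DataFrame({
    "user_id": ["dfg", "abc"],
    "event_date": ["17th July", "2nd June"],
})

keys = ["user_id", "event_date"]

# One (user_id, event_date) tuple per row of df1.
pairs = pd.MultiIndex.from_frame(df1[keys])

result = df1.copy()
# Illustrative flag names; extend the dict with df4, df5, and so on.
for flag, other in {"in_df2": df2, "in_df3": df3}.items():
    other_pairs = pd.MultiIndex.from_frame(other[keys])
    # True only where both values match together in the same row of `other`.
    result[flag] = pairs.isin(other_pairs).astype(int)

print(result)
```

Because the pair is compared as a whole, a matching user_id with a different event_date does not count as a match. The comparison is exact, so strings such as "1st june" and "1st June" would be treated as different values; converting event_date to a real datetime first is probably safer if the formats vary.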

Solution 1
0 2022-04-13 08:04:37