Check if column pair in one dataframe exists in another?

import pandas as pd

d1 = {'id': ['a','b','c'], 'ref': ['apple','orange','banana']}
df1 = pd.DataFrame(d1)

d2 = {'id': ['a','b','d'], 'ref': ['apple','orange','banana']}
df2 = pd.DataFrame(d2)

I want to see whether each (id, ref) pair in df2 also exists in df1, and record the result in a new boolean column in df2.

Desired output:

d3 = {'id': ['a','b','d'], 'ref': ['apple','orange','banana'], 'check':[True,True,False]}
df2 = pd.DataFrame(d3)

I've tried the following, along with a simple assign/isin:

df2['check'] = df2[['id','ref']].isin(df1[['id','ref']].values.ravel()).any(axis=1)

df2['check'] = df2.apply(lambda x: x.isin(df1.stack())).any(axis=1)

How can I do this WITHOUT a merge?
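
For reference, here is a small sketch of why the first attempt above cannot give the desired result. It assumes the same df1 and df2 as in the question; the names pool, cellwise and flat_check are only illustrative. Flattening df1 loses the pairing between id and ref, so each cell of df2 is compared against one pool of values.

```python
import pandas as pd

df1 = pd.DataFrame({'id': ['a', 'b', 'c'], 'ref': ['apple', 'orange', 'banana']})
df2 = pd.DataFrame({'id': ['a', 'b', 'd'], 'ref': ['apple', 'orange', 'banana']})

# Flattening df1 discards which id went with which ref.
pool = df1[['id', 'ref']].values.ravel()

# Each cell of df2 is tested against the pool on its own.
cellwise = df2[['id', 'ref']].isin(pool)

# A row passes if ANY of its cells is found in the pool.
flat_check = cellwise.any(axis=1)
print(flat_check.tolist())
```

The row ('d', 'banana') is flagged as present because 'banana' occurs somewhere in df1, even though the pair itself does not.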

I'm not sure why you don't like merge, but you can use isin with tuples:

df2['check'] = df2[['id','ref']].apply(tuple, axis=1)\
                  .isin(df1[['id','ref']].apply(tuple, axis=1))

Output:

  id     ref  check
0  a   apple   True
1  b  orange   True
2  d  banana  False
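
A possible variant of the same idea, sketched here on the assumption that pandas 0.24 or later is available (for MultiIndex.from_frame): build a MultiIndex from the two columns of each frame and let MultiIndex.isin compare whole pairs, which avoids the row-wise apply. The names cols, pairs_df1, pairs_df2 and multiindex_check are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({'id': ['a', 'b', 'c'], 'ref': ['apple', 'orange', 'banana']})
df2 = pd.DataFrame({'id': ['a', 'b', 'd'], 'ref': ['apple', 'orange', 'banana']})

cols = ['id', 'ref']

# Each MultiIndex entry is one (id, ref) pair.
pairs_df1 = pd.MultiIndex.from_frame(df1[cols])
pairs_df2 = pd.MultiIndex.from_frame(df2[cols])

# isin on a MultiIndex compares whole tuples and returns a boolean array.
multiindex_check = pairs_df2.isin(pairs_df1)

df2['check'] = multiindex_check
print(df2)
```

No merge is involved, and the result is a plain boolean array in the row order of df2.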

I think this is what you're looking for:

d1 = {'id': ['a','b','c'], 'ref': ['apple','orange','banana']}
df1 = pd.DataFrame(d1)

d2 = {'id': ['a','b','d'], 'ref': ['apple','orange','banana']}
df2 = pd.DataFrame(d2)

# note: each column is tested independently, not as an (id, ref) pair
result = df1.loc[df1.id.isin(df2.id) & df1.ref.isin(df2.ref)]

although a merge would almost certainly be more efficient:

# create a compound key from id + ref
df1["key"] = df1.apply(lambda row: f'{row["id"]}_{row["ref"]}', axis=1)
df2["key"] = df2.apply(lambda row: f'{row["id"]}_{row["ref"]}', axis=1)
# merge df1 with df2 on the compound key (inner join keeps matching keys only)
df3 = df1.merge(df2, on="key")
# select the rows of df1 whose keys matched
result = df1.set_index("id").loc[df3.id_x]
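
If a merge is acceptable after all, one way it might be written without a string key is a left merge on both columns with indicator=True. This is only a sketch under the question's setup; cols, right, merged and merge_check are hypothetical names, and drop_duplicates is there so that repeated pairs in df1 cannot add rows to the result.

```python
import pandas as pd

df1 = pd.DataFrame({'id': ['a', 'b', 'c'], 'ref': ['apple', 'orange', 'banana']})
df2 = pd.DataFrame({'id': ['a', 'b', 'd'], 'ref': ['apple', 'orange', 'banana']})

cols = ['id', 'ref']

# Unique pairs from df1, so the left merge keeps one row per df2 row.
right = df1[cols].drop_duplicates()

# indicator=True adds a '_merge' column: 'both' means the pair exists in df1.
merged = df2[cols].merge(right, on=cols, how='left', indicator=True)

# Convert to a NumPy array so assignment does not depend on index alignment.
merge_check = (merged['_merge'] == 'both').to_numpy()

df2['check'] = merge_check
print(df2)
```

This yields the boolean column directly on df2, rather than a filtered copy of df1.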
