
Compare Two Columns in Two Pandas Dataframes

I have two pandas dataframes:

df1:
a   b   c
1   1   2
2   1   2
3   1   3

df2:
a   b   c 
4   0   2 
5   5   2 
1   1   2 

import pandas as pd

df1 = {'a': [1, 2, 3], 'b': [1, 1, 1], 'c': [2, 2, 3]}
df2 = {'a': [4, 5, 1], 'b': [0, 5, 1], 'c': [2, 2, 2]}
df1 = pd.DataFrame(df1)
df2 = pd.DataFrame(df2)

I'm looking for a function that will show whether df1 and df2 contain the same value in column a.

In the example I provided, df1.a and df2.a both contain a=1.

If df1 and df2 have no entry where the values in column a are equal, the function should return None or False.

How do I do this? I've tried a couple of combinations of pandas.merge.

You could use set intersection:

def col_intersect(df1, df2, col='a'):
    s1 = set(df1[col])
    s2 = set(df2[col])
    # Return the shared values, or None when the intersection is empty
    return (s1 & s2) or None
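
As a quick check of the set-based idea, here is a minimal, self-contained sketch run on the question's data. It assumes df2.a is [4, 5, 1], as in the printed table, and the helper name common_values is hypothetical, used only for this illustration:

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 1, 1], 'c': [2, 2, 3]})
# Assumption: df2.a is [4, 5, 1], matching the table shown in the question
df2 = pd.DataFrame({'a': [4, 5, 1], 'b': [0, 5, 1], 'c': [2, 2, 2]})

def common_values(left, right, col='a'):
    # Set of values found in `col` of both frames; None if there are none
    shared = set(left[col]) & set(right[col])
    return shared or None

# Column a shares the value 1 between the two frames
print(common_values(df1, df2, 'a') == {1})  # True

# A frame with no overlapping values yields None
other = pd.DataFrame({'a': [7, 8]})
print(common_values(df1, other, 'a') is None)  # True
```

Sets discard duplicates and ordering, so this only tells you which values overlap, not which rows they came from.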

Since you tried merge, you could also do it this way:

def col_match(df1, df2, col='a'):
    merged = df1.merge(df2, how='inner', on=col)
    if len(merged):
        return merged[col]
    else:
        return None
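
One thing to watch with a merge is that repeated values in the key column multiply the rows of the result. A possible variation, sketched here under the assumption that only the distinct matching values are wanted, merges just the de-duplicated key column. The name col_match_unique is hypothetical, and df2.a is assumed to be [4, 5, 1] as in the question's table:

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 1, 1], 'c': [2, 2, 3]})
# Assumption: df2.a is [4, 5, 1], matching the table shown in the question
df2 = pd.DataFrame({'a': [4, 5, 1], 'b': [0, 5, 1], 'c': [2, 2, 2]})

def col_match_unique(left, right, col='a'):
    # Keep only the key column and drop duplicates before merging,
    # so each matching value appears once in the result
    merged = left[[col]].drop_duplicates().merge(
        right[[col]].drop_duplicates(), how='inner', on=col
    )
    return merged[col] if not merged.empty else None

print(col_match_unique(df1, df2, 'a').tolist())  # [1]
print(col_match_unique(df1, df2, 'c').tolist())  # [2]
```

Without the drop_duplicates calls, column c would produce six rows here (two 2s in df1 times three 2s in df2).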

Define your own function using isin and any:

def yourf(x, y):
    mask = x.isin(y)  # True where a value of x also appears in y
    if mask.any():
        return x[mask]
    else:
        return 'No match'  # change this to None if you prefer

yourf(df1.a,df2.a)
Out[316]: 
0    1
Name: a, dtype: int64

yourf(df1.b,df2.c)
Out[318]: 'No match'
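
If only a yes/no answer is needed, the same isin check can be reduced to a plain boolean. This is a minimal sketch; the name has_common_value is hypothetical, and df2.a is assumed to be [4, 5, 1] as in the question's table:

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 1, 1], 'c': [2, 2, 3]})
# Assumption: df2.a is [4, 5, 1], matching the table shown in the question
df2 = pd.DataFrame({'a': [4, 5, 1], 'b': [0, 5, 1], 'c': [2, 2, 2]})

def has_common_value(x, y):
    # True if at least one value of Series x also occurs in Series y
    return bool(x.isin(y).any())

print(has_common_value(df1['a'], df2['a']))  # True
print(has_common_value(df1['b'], df2['c']))  # False
```

Wrapping the result in bool() returns a plain Python True/False rather than a NumPy boolean.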


 