简体   繁体   中英

i want to match two dataframe columns in python

I have a two data frame df1 (35k record) and df2(100k records). In df1['col1'] and df2['col3'] i have unique id's. I want to match df1['col1'] with df2['col3']. If they match, I want to update df1 with one more column say df1['Match'] with value true and if not match, update with False value. I want to map this TRUE and False value against Matching and non-matching record only.

I am using .isin() function, I am getting the correct match and not match count but not able to map them correctly.

Match = df1['col1'].isin(df2['col3'])
df1['match'] = Match

I have also used merge function using by passing the parameter how=right but did not get the results.

Use df.loc indexing:

df1['Match'] = False
df1.loc[df1['col1'].isin(df2['col3']), 'Match'] = True

You can simply do as follows:

df1['Match'] = df1['col1'].isin(df2['col3'])

For instance:

import pandas as pd
data1 = [1,2,3,4,5]
data2 = [2,3,5]
df1 = pd.DataFrame(data1, columns=['a'])
df2 = pd.DataFrame(data2,columns=['c'])
print (df1)
print (df2)
df1['Match'] = df1['a'].isin(df2['c']) # if matches it returns True else False
print (df1)

Output:

  a
0  1
1  2
2  3
3  4
4  5

   c
0  2
1  3
2  5

   a  Match
0  1  False
1  2   True
2  3   True
3  4  False
4  5   True

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM