I'm working on a pandas project. I have two dataframes similar to the ones below.
DF1 :
Data1 Data2 Data3
Head Cat Fire
Limbs Dog Snow
Eyes Fish Water
Mouth Dragon Air
DF2 :
Data1 Data2
Limbs Dog
Mouth Dragon
Head Cat
Based on the above dataframes, I need to compare both DFs and, if a match is found, write "True" in a separate column, else "False".
Example: say I pick the first row of DF2, with the combination (Limbs, Dog). This should be searched for in DF1. As we can see, the combination exists in the 2nd row, so DF1's Data3 value "Snow" should be written to DF2's Data3 column, and "True" should be written to a new column because a match was found.
Expected output:
Data1 Data2 Data3 Data4
Limbs Dog Snow True
Mouth Dragon Air True
Head Cat Fire True
Eyes Fish Water False
So far, I have tried merging the two dataframes.
Current code:
df3 = pd.merge(df, valid_req , on=['Data1','Data2' ])
df3
Data1 Data2 Data3
Limbs Dog Snow
Mouth Dragon Air
Head Cat Fire
How can I achieve the expected output?
You can assign a temporary column to df2 and then merge using how='left':
In [1665]: df2['tmp'] = 1
In [1668]: x = df1.merge(df2, on=['Data1', 'Data2'], how='left')
In [1667]: x
Out[1667]:
Data1 Data2 Data3 tmp
0 Head Cat Fire 1.0
1 Limbs Dog Snow 1.0
2 Eyes Fish Water NaN
3 Mouth Dragon Air 1.0
Finally, use numpy.where to assign the new column Data4: True if x['tmp'] == 1, else False:
In [1668]: import numpy as np
In [1669]: x['Data4'] = np.where(x.tmp.eq(1), True, False)
Drop the unnecessary tmp column using df.drop. Then x is your final output:
In [1671]: x.drop(columns='tmp', inplace=True)  # the positional axis argument was removed in pandas 2.0
In [1672]: x
Out[1672]:
Data1 Data2 Data3 Data4
0 Head Cat Fire True
1 Limbs Dog Snow True
2 Eyes Fish Water False
3 Mouth Dragon Air True
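For reference, the steps above can be put together as one self-contained script. This is only a sketch: the frames are rebuilt from the sample data in the question, assign is used instead of modifying df2 in place, and the flag is read with Series.eq rather than numpy.where, which gives the same result here.

```python
import pandas as pd

# Sample frames rebuilt from the question (assumed to match the asker's data).
df1 = pd.DataFrame({
    "Data1": ["Head", "Limbs", "Eyes", "Mouth"],
    "Data2": ["Cat", "Dog", "Fish", "Dragon"],
    "Data3": ["Fire", "Snow", "Water", "Air"],
})
df2 = pd.DataFrame({
    "Data1": ["Limbs", "Mouth", "Head"],
    "Data2": ["Dog", "Dragon", "Cat"],
})

# Add the temporary marker on a copy so df2 itself stays unchanged.
flagged = df2.assign(tmp=1)

# Left join keeps every row of df1; rows without a match get NaN in tmp.
x = df1.merge(flagged, on=["Data1", "Data2"], how="left")

# NaN compares as not equal to 1, so unmatched rows become False.
x["Data4"] = x["tmp"].eq(1)
x = x.drop(columns="tmp")

print(x)
```

Using drop(columns=...) and reassigning avoids both inplace=True and the positional axis argument.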
Use DataFrame.merge with a left join and the indicator=True parameter. Then create the new column by comparing the _merge column with 'both', using DataFrame.pop to remove that column at the same time:
df = df1.merge(df2, on=['Data1', 'Data2'], how='left', indicator=True)
df['Data4'] = df.pop('_merge').eq('both')
print (df)
Data1 Data2 Data3 Data4
0 Head Cat Fire True
1 Limbs Dog Snow True
2 Eyes Fish Water False
3 Mouth Dragon Air True
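One thing to watch with a left join: if df2 contains the same (Data1, Data2) pair more than once, the matching df1 row is repeated in the result. A minimal sketch of guarding against that, assuming a df2 with a duplicated row (the duplicate is made up for illustration and is not in the question's data):

```python
import pandas as pd

df1 = pd.DataFrame({
    "Data1": ["Head", "Limbs", "Eyes", "Mouth"],
    "Data2": ["Cat", "Dog", "Fish", "Dragon"],
    "Data3": ["Fire", "Snow", "Water", "Air"],
})
# Hypothetical df2 in which (Limbs, Dog) appears twice.
df2 = pd.DataFrame({
    "Data1": ["Limbs", "Mouth", "Head", "Limbs"],
    "Data2": ["Dog", "Dragon", "Cat", "Dog"],
})

# Keep each key pair once so the join cannot multiply rows of df1.
keys = df2[["Data1", "Data2"]].drop_duplicates()

df = df1.merge(keys, on=["Data1", "Data2"], how="left", indicator=True)

# _merge is 'both' for matched rows and 'left_only' otherwise.
df["Data4"] = df.pop("_merge").eq("both")

print(df)
```

With the duplicates dropped, the result has exactly one row per row of df1.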
Simply use the apply function on DF1 to create Data4:
import pandas as pd
DF1 = pd.DataFrame([
["Head", "Cat", "Fire"],
["Limbs", "Dog", "Snow"],
["Eyes", "Fish", "Water"],
["Mouth", "Dragon", "Air"]
], columns=["Data1", "Data2", "Data3"])
DF2 = pd.DataFrame([
["Limbs", "Dog", "Snow"],
["Mouth", "Dragon", "Air"],
["Head", "Cat", "Fire"]
], columns=["Data1", "Data2", "Data3"])
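# Check both key columns, since the question asks for a match on the (Data1, Data2) combination.
DF1["Data4"] = DF1.apply(lambda row: ((DF2["Data1"] == row["Data1"]) & (DF2["Data2"] == row["Data2"])).any(), axis=1)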
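The apply call above scans DF2 once for every row of DF1, which can get slow on larger frames. A possible alternative, sketched here with plain Python tuples and a set rather than apply, builds the (Data1, Data2) pairs of DF2 once and tests membership for each row of DF1. The name wanted is only illustrative.

```python
import pandas as pd

DF1 = pd.DataFrame([
    ["Head", "Cat", "Fire"],
    ["Limbs", "Dog", "Snow"],
    ["Eyes", "Fish", "Water"],
    ["Mouth", "Dragon", "Air"],
], columns=["Data1", "Data2", "Data3"])
DF2 = pd.DataFrame([
    ["Limbs", "Dog"],
    ["Mouth", "Dragon"],
    ["Head", "Cat"],
], columns=["Data1", "Data2"])

# Build the set of key pairs from DF2 once.
wanted = set(zip(DF2["Data1"], DF2["Data2"]))

# A row of DF1 matches when its own (Data1, Data2) pair is in that set.
DF1["Data4"] = [pair in wanted for pair in zip(DF1["Data1"], DF1["Data2"])]

print(DF1)
```

This compares the pair as a whole, so a row only counts as a match when both columns agree with the same row of DF2.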