
Missing data in Pandas Merge

I am merging two DataFrames on a common column:

df1

                  email                           account 
0                 555@i555.com                    555
1                 666@666.com                     666
2                 777@666.com                     Nan
3                 888@666.com                     999


df2 (I think ip is the index here)

ip                account   
1.1.1.1           555
2.2.2.2           666
.
. 


df3 = pd.merge(df1, df2, on='account')

In this case, the merged result has missing data. How can I avoid this?

Try setting the join type explicitly with the how parameter:

pd.merge(df1, df2, on='account', how='left')

or

pd.merge(df1, df2, on='account', how='inner')
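
To see which rows fail to match under each join type, a small sketch may help. The frames below are hypothetical stand-ins for the sample data, with account stored as int on both sides, and indicator=True adds a _merge column reporting where each row came from.

```python
import pandas as pd

# Hypothetical stand-ins for the question's frames (account is int on both sides).
df1 = pd.DataFrame({
    "email": ["555@i555.com", "666@666.com", "888@666.com"],
    "account": [555, 666, 999],
})
df2 = pd.DataFrame({"ip": ["1.1.1.1", "2.2.2.2"], "account": [555, 666]})

# how='left' keeps every row of df1; unmatched rows get NaN in the df2 columns.
left = pd.merge(df1, df2, on="account", how="left", indicator=True)

# how='inner' (the default) keeps only rows whose key exists in both frames.
inner = pd.merge(df1, df2, on="account", how="inner")

print(left)
print(inner)
```

Rows marked left_only in the _merge column are the ones with no matching key in df2.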

EDIT: Looking at your sample data, you are merging a str column with an int column. The keys never match, which is why the result is all NaN.

df1.applymap(type)
Out[96]: 
           email        account
0  <class 'str'>  <class 'str'>
1  <class 'str'>  <class 'str'>
2  <class 'str'>  <class 'str'>
3  <class 'str'>  <class 'str'>
df2.applymap(type)
Out[97]: 
               account
ip                    
1.1.1.1  <class 'int'>
2.2.2.2  <class 'int'>
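
A quick way to spot this kind of mismatch before merging is to compare the key columns directly. The sketch below rebuilds the sample frames under the assumption that df1.account holds strings and df2.account holds integers; the helper name key_types is made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "email": ["555@i555.com", "666@666.com", "777@666.com", "888@666.com"],
    "account": ["555", "666", "Nan", "999"],  # strings, including the literal 'Nan'
})
df2 = pd.DataFrame(
    {"account": [555, 666]},
    index=pd.Index(["1.1.1.1", "2.2.2.2"], name="ip"),
)

def key_types(df, key):
    """Return the set of Python type names found in df[key] (hypothetical helper)."""
    return set(df[key].map(lambda v: type(v).__name__))

print(df1["account"].dtype, key_types(df1, "account"))
print(df2["account"].dtype, key_types(df2, "account"))

# True when one key column is numeric and the other is not.
mismatch = (pd.api.types.is_numeric_dtype(df1["account"])
            != pd.api.types.is_numeric_dtype(df2["account"]))
print("dtype mismatch:", mismatch)
```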

How to fix that:

Option 1

Convert the str column to numeric with pd.to_numeric:

df1.account = pd.to_numeric(df1.account, errors='coerce')
df1.applymap(type)
Out[99]: 
           email          account
0  <class 'str'>  <class 'float'>
1  <class 'str'>  <class 'float'>
2  <class 'str'>  <class 'float'>
3  <class 'str'>  <class 'float'>

df1.merge(df2.reset_index(),on=['account'],how='left')


Out[101]: 
          email account       ip
0  555@i555.com     555  1.1.1.1
1   666@666.com     666  2.2.2.2
2   777@666.com     NaN      NaN
3   888@666.com     999      NaN
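
For reference, Option 1 can be run end to end as a self-contained sketch. The frames are reconstructed from the sample data in the question, on the assumption that df1.account holds strings and ip is the index of df2. Because the coerced column contains NaN, pandas stores it as a floating-point column.

```python
import pandas as pd

df1 = pd.DataFrame({
    "email": ["555@i555.com", "666@666.com", "777@666.com", "888@666.com"],
    "account": ["555", "666", "Nan", "999"],  # assumed to be strings
})
df2 = pd.DataFrame(
    {"account": [555, 666]},
    index=pd.Index(["1.1.1.1", "2.2.2.2"], name="ip"),
)

# errors='coerce' yields NaN instead of raising on values that are not numbers,
# so the 'Nan' string ends up as a real NaN.
df1["account"] = pd.to_numeric(df1["account"], errors="coerce")

# reset_index() turns the ip index into an ordinary column so it survives the merge.
merged = df1.merge(df2.reset_index(), on="account", how="left")
print(merged)
```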

Option 2

Alternatively, convert df2.account to str (I prefer the first option, pd.to_numeric):

df2.account = df2.account.astype(str)
df1.merge(df2.reset_index(),on=['account'],how='left')
Out[105]: 
          email account       ip
0  555@i555.com     555  1.1.1.1
1   666@666.com     666  2.2.2.2
2   777@666.com     Nan      NaN
3   888@666.com     999      NaN
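
Option 2 can be sketched the same way. Again the frames are reconstructed from the question's sample data, assuming df1.account already holds strings.

```python
import pandas as pd

df1 = pd.DataFrame({
    "email": ["555@i555.com", "666@666.com", "777@666.com", "888@666.com"],
    "account": ["555", "666", "Nan", "999"],  # assumed to be strings
})
df2 = pd.DataFrame(
    {"account": [555, 666]},
    index=pd.Index(["1.1.1.1", "2.2.2.2"], name="ip"),
)

# Cast the integer key to str so both key columns hold the same kind of value.
df2["account"] = df2["account"].astype(str)

merged = df1.merge(df2.reset_index(), on="account", how="left")
print(merged)
```

String keys must match character for character, so values such as ' 555' or '0555' would not match '555'.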

