I simply merge two dataframes in common column:
df1
email account
0 555@i555.com 555
1 666@666.com 666
2 777@666.com Nan
3 888@666.com 999
df2 (i think ip is index here)
ip account
1.1.1.1 555
2.2.2.2 666
.
.
df3= pd.merge(df1,df2,on='accountname')
in this case, I have missing data. How can I avoid this?
pd.merge(df1,df2,on='accountname',how='left')
Or
pd.merge(df1,df2,on='accountname',how='inner')
EDIT : Let us see your sample data, you merge
str with int. that why all NaN
df1.applymap(type)
Out[96]:
email account
0 <class 'str'> <class 'str'>
1 <class 'str'> <class 'str'>
2 <class 'str'> <class 'str'>
3 <class 'str'> <class 'str'>
df2.applymap(type)
Out[97]:
account
ip
1.1.1.1 <class 'int'>
2.2.2.2 <class 'int'>
How to do that:
Option1
Change str
to numeric
by using pd.to_numeric
df1.account=pd.to_numeric(df1.account,errors ='coerce')
df1.applymap(type)
Out[99]:
email account
0 <class 'str'> <class 'float'>
1 <class 'str'> <class 'float'>
2 <class 'str'> <class 'float'>
3 <class 'str'> <class 'float'>
df1.merge(df2.reset_index(),on=['account'],how='left')
Out[101]:
email account ip
0 555@i555.com 555 1.1.1.1
1 666@666.com 666 2.2.2.2
2 777@666.com NaN NaN
3 888@666.com 999 NaN
Option 2
We just change the df2.account
to str
(I prefer using the first pd.to-numeric
)
df2.account=df2.account.astype(str)
df1.merge(df2.reset_index(),on=['account'],how='left')
Out[105]:
email account ip
0 555@i555.com 555 1.1.1.1
1 666@666.com 666 2.2.2.2
2 777@666.com Nan NaN
3 888@666.com 999 NaN
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.