I have two dataframes with three columns each:
dfA = X: 1, 4
Y: 2, 5
Z: (empty), 6
dfB = X: 1, 4
Y: 2, 5
C: (empty), 6
I build my output dataframe by joining the two dataframes above:
dfC = dfA.merge(dfB, left_on=['X','Y','Z'], right_on=['X','Y','C'], how='left')
I am only getting the values of the second row, not the first row:
dfC = X: 4
Y: 5
Z: 6
How can I handle missing values like these in joins? Let me know if the question is unclear.
import pandas as pd
import numpy as np
dfA = pd.DataFrame({"X": [1, 4],
                    "Y": [2, 5],
                    "Z": ['', 6]})
print(dfA)
dfB = pd.DataFrame({"X": [1, 4],
                    "Y": [2, 5],
                    "C": ['', 6]})
print(dfB)
dfC = dfA.merge(dfB, left_on=['X','Y','Z'], right_on=['X','Y','C'], how='left')
print(dfC)
Outputs:
   X  Y  Z
0  1  2
1  4  5  6
   X  Y  C
0  1  2
1  4  5  6
   X  Y  Z  C
0  1  2
1  4  5  6  6
You might have whitespace issues, where one dataframe has a single space and the other has two or more spaces.
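If whitespace is the culprit, one option is to normalize the blank cells before merging. Here is a minimal sketch, assuming the blanks are empty or whitespace-only strings. The sample data and the helper name `blank_to_nan` are made up for illustration and are not from the question:

```python
import numpy as np
import pandas as pd

# Illustrative data: the blank cell is one space in dfA and two spaces in dfB
dfA = pd.DataFrame({"X": [1, 4], "Y": [2, 5], "Z": [" ", 6]})
dfB = pd.DataFrame({"X": [1, 4], "Y": [2, 5], "C": ["  ", 6]})

def blank_to_nan(col):
    # Hypothetical helper: map empty or whitespace-only strings to NaN
    return col.map(lambda v: np.nan if isinstance(v, str) and v.strip() == "" else v)

keys = dict(left_on=["X", "Y", "Z"], right_on=["X", "Y", "C"], how="left")

# Before cleaning, " " and "  " are different key values, so row 0 finds no match
before = dfA.merge(dfB, indicator=True, **keys)

dfA["Z"] = blank_to_nan(dfA["Z"])
dfB["C"] = blank_to_nan(dfB["C"])

# After cleaning, both blanks are NaN, and pandas matches NaN keys to each other
after = dfA.merge(dfB, indicator=True, **keys)

print(before["_merge"].tolist())
print(after["_merge"].tolist())
```

The `_merge` column added by `indicator=True` shows, per row, whether a match was found on the right side, which makes a whitespace mismatch easy to spot.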
I don't quite understand. Your code appears to be working as expected. Here is an MCVE:
import pandas as pd
import numpy as np
dfA = pd.DataFrame({"X":[1,4],
"Y":[2,5],
"Z":[np.nan,6]})
print(dfA)
Output:
   X  Y    Z
0  1  2  NaN
1  4  5  6.0
And,
dfB = pd.DataFrame({"X":[1,4],
"Y":[2,5],
"C":[np.nan,6]})
print(dfB)
Output:
   X  Y    C
0  1  2  NaN
1  4  5  6.0
A left merge yields dfC:
dfC=dfA.merge(dfB, left_on=['X','Y','Z'], right_on=['X','Y','C'], how='left')
print(dfC)
Output:
   X  Y    Z    C
0  1  2  NaN  NaN
1  4  5  6.0  6.0
Were you expecting something different?
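If it helps to confirm which rows actually found a partner, `merge` accepts `indicator=True`. The sketch below reuses the NaN-based frames from this answer. As far as I know, pandas treats NaN keys as equal to each other when merging (unlike a SQL join), so both rows should be reported as matched. The variable name `unmatched` is only for illustration:

```python
import numpy as np
import pandas as pd

dfA = pd.DataFrame({"X": [1, 4], "Y": [2, 5], "Z": [np.nan, 6]})
dfB = pd.DataFrame({"X": [1, 4], "Y": [2, 5], "C": [np.nan, 6]})

# indicator=True adds a "_merge" column: "both", "left_only" or "right_only"
dfC = dfA.merge(dfB, left_on=["X", "Y", "Z"], right_on=["X", "Y", "C"],
                how="left", indicator=True)
print(dfC)

# Rows from dfA that found no matching row in dfB
unmatched = dfC[dfC["_merge"] == "left_only"]
print(len(unmatched))
```

If your real data shows `left_only` for the first row, the blank values in the two frames are probably not the same value (for example `''` versus `' '`, or a string versus NaN).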