Merge returns NaN in Pandas
I want to merge df1 and df2 on the common column ID. df2 looks like this:
       ID    TYPE  VALUE1   VALUE2  VALUE3
0  672117  Single    0.25   923.77   94.08
1  MSN242  DOUBLE    0.21  1219.31  105.77
2  673312  DOUBLE    0.20  4030.08  113.00
3  222255  Single    0.23  1119.38  126.69
Code used:
df3 = pd.merge(df1, df2, on ='ID', how = 'left')
It seems there are 2510 common IDs (all IDs matched):
len(list(set(df1.ID) and set(df2.ID)))
>>> 2510
but df3 shows that the columns TYPE, VALUE1, VALUE2 and VALUE3 are mostly NaN. What went wrong?
Edit: df1 (shape 2510 rows × 22 columns) looks like this:
       ID  CRITERION1                 DATE     MEAS1      MEAS2     MEAS3  COMPOSITION  DPMT  %CONTENT1  %CONTENT2  MeanGroup  %Article1  %CA_Count  %CA_Count1  CATEGORY1  CATEGORY2    CODE  Group    COST1    COST2    COST3    COST4
0  000002           Y  2009-01-03 11:52:46  0.930150  17.412708  1.583333    Component     P   0.407859   0.979346          C   0.401572   0.000098    0.946168          Z          L  LEVEL1     NY   1767.0   1767.0   1767.0   1767.0
1  XC-004           Y  2009-01-03 11:52:46  1.898295   0.548192  0.250000    Component    NP   0.874263   0.999742          C   0.797250   0.000015    0.995345          Z          M  LEVEL1     NU  15525.0  15525.0  15525.0  15525.0
Since you merged (joined) left, the result keeps every ID from the left table (df1) and drops all rows of df2 that have no match. For the IDs that exist only on the left, it then fills the missing VALUE1, VALUE2 and VALUE3 with NaN. I'd assume your ID mismatch is pretty large and you have len(df1.ID) - 2510 rows of NaNs in your table.
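One way to check how many IDs really match is sketched below. Note that set(df1.ID) and set(df2.ID) is not an intersection: Python's `and` returns its second operand whenever the first is non-empty, so that expression is simply set(df2.ID). The `&` operator gives the actual intersection, and the indicator=True option of pd.merge marks each row of the result as matched or left-only. The two small frames here are made-up stand-ins for the real df1 and df2, not the original data.

```python
import pandas as pd

# Hypothetical miniature frames standing in for the real df1 and df2.
df1 = pd.DataFrame({"ID": ["000002", "XC-004", "672117"],
                    "CRITERION1": ["Y", "Y", "N"]})
df2 = pd.DataFrame({"ID": ["672117", "MSN242"],
                    "TYPE": ["Single", "DOUBLE"]})

# `and` yields the second operand when the first is truthy,
# so this is just set(df2.ID), not the set of common IDs.
not_an_intersection = set(df1.ID) and set(df2.ID)

# `&` computes the real set intersection.
common_ids = set(df1.ID) & set(df2.ID)

# indicator=True adds a `_merge` column; for a left join each row
# is labelled either "both" (matched) or "left_only" (no match in df2).
df3 = pd.merge(df1, df2, on="ID", how="left", indicator=True)
merge_counts = df3["_merge"].value_counts()

print(len(not_an_intersection), len(common_ids))
print(merge_counts)
```

If the count of "left_only" rows is large even though the IDs look identical, one possible cause (an assumption, not something the question confirms) is that the two ID columns differ in dtype or carry stray whitespace; comparing df1.ID.dtype with df2.ID.dtype would show that.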