Merge rows inside a DataFrame by replacing NaNs in different columns
I have a df:
import numpy as np
import pandas as pd
df = pd.DataFrame([[1, np.nan, "filled", 3], [1, "filled", np.nan, 3], [1, "filled", np.nan, 4]], columns = ["a", "b", "c", "d"])
   a       b       c  d
0  1     NaN  filled  3
1  1  filled     NaN  3
2  1  filled     NaN  4
And my end result should be:
df = pd.DataFrame([[1, "filled", "filled", 3], [1, "filled", np.nan, 4]], columns = ["a", "b", "c", "d"])
   a       b       c  d
0  1  filled  filled  3
1  1  filled     NaN  4
So I want to merge the rows that are identical in every respect other than columns b and c. The issue is that there will not always be another row that is identical except for columns b and c.
I can't work out how to use df.groupby(["a", "d"]).apply() to get what I want.
You can use groupby + first: within each group, first() returns the first non-NaN value of every column.
df.groupby(['a','d'],as_index=False).first()
Out[897]:
   a  d       b       c
0  1  3  filled  filled
1  1  4  filled     NaN
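Note that the grouping keys a and d move to the front of the result. The sketch below is one way to reproduce the answer end to end and put the columns back in their original order. The name merged is only illustrative, and the reindexing step is an addition that is not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [[1, np.nan, "filled", 3], [1, "filled", np.nan, 3], [1, "filled", np.nan, 4]],
    columns=["a", "b", "c", "d"],
)

# Group on the columns that must match exactly; first() then picks,
# per column, the first non-null value found in each group.
merged = df.groupby(["a", "d"], as_index=False).first()

# The group keys come first in the result, so restore the original order.
merged = merged[df.columns]

print(merged)
```

If two rows in the same group hold different non-null values in b or c, first() keeps whichever appears first, so this approach fits best when the rows only differ by where the NaNs are.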