Pandas DataFrame combine rows by column value, where rows can have NaNs
I have a Pandas DataFrame like the following:
timestamp A B C D E F
0 1607594400000 83.69 NaN NaN NaN 1003.20 8.66
1 1607594400000 NaN 2.57 44.35 17.18 NaN NaN
2 1607595000000 83.07 NaN NaN NaN 1003.32 8.68
3 1607595000000 NaN 3.00 42.31 20.08 NaN NaN
.. ... ... ... ... ... ... ...
325 1607691600000 90.19 NaN NaN NaN 997.32 10.22
326 1607691600000 NaN 1.80 30.10 14.85 NaN NaN
328 1607692200000 NaN 1.60 26.06 12.78 NaN NaN
327 1607692200000 91.33 NaN NaN NaN 997.52 10.21
I need to combine the rows that have the same timestamp value. Where a column holds NaN in one row and a value in the other, the value should be kept; where both rows hold a value, the average of the two should be calculated.
I tried the solution to the following question, but it does not exactly match my situation and I don't know how to adapt it: pandas, combine rows based on certain column values and NAN
Just use groupby:
df.groupby('timestamp', as_index=False).mean()
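To see why this covers both cases, here is a minimal sketch on a small made-up frame (the column values are illustrative, not taken from the question). The groupby mean skips NaN by default, so a lone value is kept as-is, two values are averaged, and a column that is NaN in every row of a group stays NaN.

```python
import numpy as np
import pandas as pd

# Made-up data: two rows per timestamp, chosen to show each case
df = pd.DataFrame({
    "timestamp": [1, 1, 2, 2],
    "A": [10.0, np.nan, 20.0, 40.0],    # nan-value, then value-value
    "B": [np.nan, 5.0, np.nan, np.nan], # nan-value, then nan-nan
})

# mean() ignores NaN within each group
out = df.groupby("timestamp", as_index=False).mean()
print(out)
```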
Try first; it picks the first non-null value in each column for each group. Note that it does not average when more than one row in a group has a value:
out = df.groupby('timestamp', as_index=False).first()
Or
# mean(level=0) was removed in pandas 2.0; groupby(level=0) is the equivalent
out = df.set_index('timestamp').groupby(level=0).mean()
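The two aggregations only differ when a group has more than one non-null value in the same column. A small sketch with made-up numbers, assuming the same timestamp grouping as above:

```python
import numpy as np
import pandas as pd

# Made-up data: one timestamp, two rows
df = pd.DataFrame({
    "timestamp": [1, 1],
    "A": [2.0, 4.0],     # both rows have a value
    "B": [np.nan, 7.0],  # only one row has a value
})

grouped = df.groupby("timestamp", as_index=False)
by_first = grouped.first()  # keeps the first non-null value per column
by_mean = grouped.mean()    # averages the non-null values per column

print(by_first)
print(by_mean)
```

For column B both give the same result; for column A only the mean matches the averaging the question asks for.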